Slice of Data Science 3/24/2021 - Talk From Clare Heinbaugh
Reeva Sachdev
Clare is a sophomore and 1693 Scholar at William & Mary. Her presentation at this Slice of Data Science meeting was on transfer learning, stacked generalization, and Keras. Throughout the meeting, Clare illustrated these topics with examples from her own work.
The first topic covered was neural networks. A neural network is one type of machine learning model used to classify data. It has an input layer, a hidden layer, and an output layer, and it is used to make predictions from the inputs. The example Clare talked about was a neural network on Spotify data that could predict the true genre of a song given different sonic variables as inputs. To build the network, each row of the data is an observation that gets expanded into a vector; weights are applied to it and then updated so that the predictive power of the model improves. That process is called model fitting, and it takes a long time. If you want to know how good the model is, you do a train-test split; that way you can measure accuracy by applying the model to data it was never trained on. I thought the Spotify example was really interesting because it's something that pertains to college students, since so many of us use Spotify and listen to music. It was very engaging.
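Here is a minimal sketch of that workflow in Keras, assuming a made-up feature matrix standing in for Spotify audio features and five hypothetical genre labels (this is my own illustration, not Clare's actual code):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Placeholder data standing in for Spotify audio features (shapes are hypothetical).
X = np.random.rand(1000, 10)           # 1000 songs, 10 sonic variables each
y = np.random.randint(0, 5, 1000)      # 5 possible genres

# Train-test split so we can evaluate on songs the model never saw.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Input layer -> hidden layer -> output layer, as described in the talk.
model = keras.Sequential([
    keras.Input(shape=(10,)),                      # input layer: 10 audio features
    keras.layers.Dense(32, activation="relu"),     # hidden layer
    keras.layers.Dense(5, activation="softmax"),   # output layer: one unit per genre
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Model fitting: repeatedly update the weights to improve the predictions.
model.fit(X_train, y_train, epochs=10, verbose=0)

# Accuracy on held-out data tells us how good the model really is.
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print(f"test accuracy: {acc:.2f}")
```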
Working with neural networks involves a lot of linear algebra. There is a Python library called Keras that handles this linear algebra and is very fast, since it uses vectorized operations. In a neural network, the input passes through the hidden layer, where matrix operations are applied.
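As a rough illustration of the matrix operations that Keras vectorizes for us, here is a tiny hand-rolled forward pass in NumPy; the shapes and weights are made up purely for the example:

```python
import numpy as np

x = np.random.rand(4)         # one observation expanded into a vector (4 features)
W1 = np.random.rand(4, 3)     # weights from the input layer to a 3-unit hidden layer
b1 = np.random.rand(3)
W2 = np.random.rand(3, 2)     # weights from the hidden layer to a 2-unit output layer
b2 = np.random.rand(2)

hidden = np.maximum(0, x @ W1 + b1)   # matrix multiply plus ReLU in the hidden layer
output = hidden @ W2 + b2             # matrix multiply into the output layer
print(output)
```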
Within neural networks, convolutional neural networks were also discussed. Convolutional neural networks are a class of neural networks most commonly applied to analyzing visual imagery. They try to identify the relevant features of an image and then assign importance weights to those features. In a convolutional neural network, a filter is convolved over the image. The example Clare used was an image of a tomato. The photo of the tomato can be turned into a vector because it is made up of color bands, and the network identifies important features of the image and applies weights to them. The photo can also be split into its blue, red, and green color bands. The top box of the tomato image didn't have any red, which is why there was a 0 in that position of the matrix.
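Here is a small sketch of that idea in Keras, with a random array standing in for the tomato photo and layer sizes I chose just for illustration:

```python
import numpy as np
from tensorflow import keras

image = np.random.rand(1, 64, 64, 3)   # one 64x64 "photo" with 3 color bands
# The three color bands the image is made of (red, green, blue).
red, green, blue = image[..., 0], image[..., 1], image[..., 2]

cnn = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    keras.layers.Conv2D(8, kernel_size=3, activation="relu"),  # 8 filters convolved over the image
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(2, activation="softmax"),   # e.g. "tomato" vs. "not tomato"
])
prediction = cnn.predict(image)   # weighted features feed into the final prediction
```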
Moving on to transfer learning: transfer learning allows us to take weights that have already been trained to recognize images and apply them to our own problem and to new images. It is basically a machine learning technique for re-using a pre-trained model on a new problem. The example Clare used was predicting road quality from satellite imagery. The RoadRunner lab at William & Mary was trying to do this with satellite images and phone data. I thought that was impressive because students actually went out, drove on the roads, and recorded the road quality. Using the true quality of the road, they tried to fit a model mapping the satellite images of the road to the actual road quality recorded in the app.
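As a hedged sketch of what reusing pre-trained weights can look like in Keras, here is a pre-trained MobileNetV2 base (trained on ImageNet) with an invented three-class "road quality" output; the RoadRunner lab's actual model and data are not shown here:

```python
from tensorflow import keras

# Pre-trained convolutional base with its original classification head removed.
base = keras.applications.MobileNetV2(weights="imagenet",
                                      include_top=False,
                                      input_shape=(224, 224, 3),
                                      pooling="avg")

# Add our own output layer for the new problem.
model = keras.Sequential([
    base,
    keras.layers.Dense(3, activation="softmax"),  # e.g. good / fair / poor road quality
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(...) would then be run on the satellite images and their true road-quality labels.
```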
There are advantages to using a pre-trained model. For example, you can train your model much faster because you are starting from weights that have already been determined for other images. When you are trying to make predictions, you want to get your weights to values that produce the best classifications as quickly as possible, and that model-fitting process is really difficult. So if you start with weights that are already pretty close to what you want them to be, you have a head start on solving the problem.
There are three strategies when deciding which layers to freeze. Strategy 1 is to train the entire model: none of the weights are frozen, so you go through and adjust all of them. Strategy 2 is to train some layers and leave the others frozen. Strategy 3 is to freeze the convolutional base. In the example Clare used, if your data set is completely different from the data set the model was originally trained on, it is best to use Strategy 1. But the problem of overfitting can occur if we don't have a lot of data. Overfitting is when the model is very good at predicting the training data but bad at predicting the testing data, because it is fitted so closely to the training data.
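Here is a sketch of how the three strategies might be expressed in Keras, reusing the same kind of pre-trained base as above (the layer counts are illustrative, not from the talk):

```python
from tensorflow import keras

base = keras.applications.MobileNetV2(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3), pooling="avg")

# Strategy 1: train the entire model (nothing frozen).
base.trainable = True

# Strategy 2: train some layers and leave the rest frozen
# (here, only the last 20 layers of the base stay trainable).
base.trainable = True
for layer in base.layers[:-20]:
    layer.trainable = False

# Strategy 3: freeze the whole convolutional base and train only the new head.
base.trainable = False
```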
Lastly, stacked generalization was discussed. Stacked generalization is a way to improve a model: you take input from many models that use many features and accept the different votes, or outputs, from several models for the label of an image. It is also possible to learn to trust a specific model and prefer its outcome over the others. Overall, you feed the predictions of the base models in as inputs to another model that makes the final prediction. It's a way to improve your model because you're not just using one model to make a prediction; you're actually taking input from multiple models that may be picking up on different features in different ways in order to determine what that image is.
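As one possible illustration (not necessarily how Clare implemented it), scikit-learn's StackingClassifier does exactly this on a toy dataset: several base models vote, and a final model learns how much to trust each one:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Toy data standing in for a real classification problem.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("forest", RandomForestClassifier()),
                ("svm", SVC(probability=True))],
    final_estimator=LogisticRegression(),   # learns how much to trust each base model
)
stack.fit(X_train, y_train)
print("stacked accuracy:", stack.score(X_test, y_test))
```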
There was also a small plug for the Developers Student Club, which one can join to implement their own scripts and projects. I'm thinking that I might join this club, especially because I believe it will give me more knowledge about data science overall.
Overall, I thought the talk was really informative. It gave me good insight into topics I am likely to learn in future Data Science classes, along with practical applications from real examples. There is definitely a lot of complexity to these topics moving forward, whether coding-wise or concept-wise. I have to admit, I was a bit confused by some of it because I didn't know the topic well beforehand. But still, the talk was really interesting, and I'm likely to go to another meeting. As I keep going to them, I feel like I'll understand more, because after this meeting I had to look up a few of the terms in order to fully understand them.