Hello World – Machine Learning Recipes #1

September 2, 2019


[MUSIC PLAYING]

Six lines of code is all it takes to write your first Machine Learning program. My name’s Josh Gordon, and today I’ll walk you through writing Hello World for Machine Learning. In the first few episodes of the series, we’ll teach you how to get started with Machine Learning from scratch. To do that, we’ll work with two open source libraries, scikit-learn and TensorFlow. We’ll see scikit in action in a minute.

But first, let’s talk quickly about what Machine Learning is and why it’s important. You can think of Machine Learning as a subfield of artificial intelligence. Early AI programs typically excelled at just one thing. For example, Deep Blue could play chess at a championship level, but that’s all it could do. Today we want to write one program that can solve many problems without needing to be rewritten. AlphaGo is a great example of that. As we speak, it’s competing in the World Go Championship. But similar software can also learn to play Atari games. Machine Learning is what makes that possible. It’s the study of algorithms that learn from examples and experience instead of relying on hard-coded rules. So that’s the state of the art.

But here’s a much simpler example we’ll start coding up today. I’ll give you a problem that sounds easy but is impossible to solve without Machine Learning. Can you write code to tell the difference between an apple and an orange? Imagine I asked you to write a program that takes an image file as input, does some analysis, and outputs the type of fruit. How can you solve this?

You’d have to start by writing lots of manual rules. For example, you could write code to count how many orange pixels there are and compare that to the number of green ones. The ratio should give you a hint about the type of fruit. That works fine for simple images like these. But as you dive deeper into the problem, you’ll find the real world is messy, and the rules you write start to break. How would you write code to handle black-and-white photos or images with no apples or oranges in them at all? In fact, for just about any rule you write, I can find an image where it won’t work. You’d need to write tons of rules, and that’s just to tell the difference between apples and oranges. If I gave you a new problem, you’d need to start all over again.

Clearly, we need something better. To solve this, we need an algorithm that can figure out the rules for us, so we don’t have to write them by hand. And for that, we’re going to train a classifier. For now you can think of a classifier as a function. It takes some data as input and assigns a label to it as output. For example, I could have a picture and want to classify it as an apple or an orange. Or I have an email, and I want to classify it as spam or not spam. The technique to write the classifier automatically is called supervised learning. It begins with examples of the problem you want to solve.

To code this up, we’ll work with scikit-learn. Here, I’ll download and install the library. There are a couple of different ways to do that. But for me, the easiest has been to use Anaconda. This makes it easy to get all the dependencies set up and works well cross-platform. With the magic of video, I’ll fast forward through downloading and installing it. Once it’s installed, you can test that everything is working properly by starting a Python script and importing sklearn. Assuming that worked, that’s line one of our program down, five to go.

To use supervised learning, we’ll follow a recipe with a few standard steps. Step one is to collect training data. These are examples of the problem we want to solve. For our problem, we’re going to write a function to classify a piece of fruit. For starters, it will take a description of the fruit as input and predict whether it’s an apple or an orange as output, based on features like its weight and texture.

To collect our training data, imagine we head out to an orchard. We’ll look at different apples and oranges and write down measurements that describe them in a table. In Machine Learning these measurements are called features. To keep things simple, here we’ve used just two: how much each fruit weighs in grams and its texture, which can be bumpy or smooth. A good feature makes it easy to discriminate between different types of fruit. Each row in our training data is an example. It describes one piece of fruit. The last column is called the label. It identifies what type of fruit is in each row, and there are just two possibilities: apples and oranges. The whole table is our training data. Think of these as all the examples we want the classifier to learn from. The more training data you have, the better a classifier you can create.

Now let’s write down our training data in code. We’ll use two variables: features and labels. Features contains the first two columns, and labels contains the last. You can think of features as the input to the classifier and labels as the output we want. I’m going to change the variable types of all features to ints instead of strings, so I’ll use 0 for bumpy and 1 for smooth. I’ll do the same for our labels, so I’ll use 0 for apple and 1 for orange. These are lines two and three in our program.

Step two in our recipe is to use these examples to train a classifier. The type of classifier we’ll start with is called a decision tree. We’ll dive into the details of how these work in a future episode. But for now, it’s OK to think of a classifier as a box of rules. That’s because there are many different types of classifier, but the input and output type is always the same. I’m going to import the tree. Then on line four of our script, we’ll create the classifier. At this point, it’s just an empty box of rules. It doesn’t know anything about apples and oranges yet.

To train it, we’ll need a learning algorithm. If a classifier is a box of rules, then you can think of the learning algorithm as the procedure that creates them. It does that by finding patterns in your training data. For example, it might notice oranges tend to weigh more, so it’ll create a rule saying that the heavier a fruit is, the more likely it is to be an orange. In scikit, the training algorithm is included in the classifier object, and it’s called fit. You can think of fit as being a synonym for “find patterns in data.” We’ll get into the details of how this happens under the hood in a future episode.

At this point, we have a trained classifier. So let’s take it for a spin and use it to classify a new fruit. The input to the classifier is the features for a new example. Let’s say the fruit we want to classify is 150 grams and bumpy. The output will be 0 if it’s an apple or 1 if it’s an orange. Before we hit Enter and see what the classifier predicts, let’s think for a sec. If you had to guess, what would you say the output should be? To figure that out, compare this fruit to our training data. It looks like it’s similar to an orange because it’s heavy and bumpy. That’s what I’d guess anyway, and if we hit Enter, it’s what our classifier predicts as well.

If everything worked for you, then that’s it for your first Machine Learning program. You can create a new classifier for a new problem just by changing the training data. That makes this approach far more reusable than writing new rules for each problem.

Now, you might be wondering why we described our fruit using a table of features instead of using pictures of the fruit as training data. Well, you can use pictures, and we’ll get to that in a future episode. But, as you’ll see later on, the way we did it here is more general. The neat thing is that programming with Machine Learning isn’t hard. But to get it right, you need to understand a few important concepts. I’ll start walking you through those in the next few episodes. Thanks very much for watching, and I’ll see you then.

[MUSIC PLAYING]
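
The transcript describes the six-line program without showing it, so here is a minimal sketch of what it likely looks like. The feature and label values are an assumption: they are taken from the code viewers posted in the comments below, not from the transcript itself, and the variable names follow the narration (features, labels) plus the `clf` name used by commenters.

```python
from sklearn import tree

# Training data: [weight in grams, texture], with 0 = bumpy and 1 = smooth.
# These values are assumed from viewer comments, not stated in the transcript.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
# Labels: 0 = apple, 1 = orange.
labels = [0, 0, 1, 1]

# Create the empty "box of rules", then let fit find patterns in the data.
clf = tree.DecisionTreeClassifier()
clf = clf.fit(features, labels)

# Classify a 150 g, bumpy fruit.
print(clf.predict([[150, 0]]))  # prints [1], i.e. orange
```

Swapping in a different table of features and labels is all it takes to reuse the same six lines for another problem, which is the point the video makes about reusability.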

100 Comments

  • Reply rudrakshya1 January 14, 2019 at 2:34 am

    I have completed Coursera, udemy machine learning A-Z. Udacity course.
    But this is a
    Best getting started I have ever seen for a developer.

  • Reply anim x January 15, 2019 at 1:06 pm

    this is how i did it and its working
    but make sure u are using python 3
    and also have installed scikit-learn

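    import sklearn
    from sklearn import tree

    # [weight in grams, texture]: 0 = bumpy, 1 = smooth
    features = [[140, 1], [130, 1], [150, 0], [170, 0]]
    # 0 = apple, 1 = orange
    labels = [0, 0, 1, 1]

    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(features, labels)

    # predict returns an array; take its single element
    x = clf.predict([[150, 1]])[0]

    if x == 1:
        print('orange')
    else:
        print('apple')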

  • Reply Hand Of LEGION January 15, 2019 at 3:18 pm

    Well….. understood nothing.

  • Reply Timothée Oliveau January 16, 2019 at 9:27 pm

    Tried to run the code, got "Launching humanity destruction process".

  • Reply Aero Mateen January 19, 2019 at 5:40 am

    Cool.

  • Reply Abdullah Hussain January 20, 2019 at 1:32 pm

    Google developer using Mac 😂

  • Reply Kun Yu Tsai January 21, 2019 at 10:39 pm

    Awesome class for a beginner. Thank you!!

  • Reply ImSalman January 22, 2019 at 11:46 am

    How do I make it say orange or apple instead of giving me a number?
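
One possible answer, sketched under the assumption that the labels follow the video's encoding (0 for apple, 1 for orange): look the predicted number up in a small dictionary. `label_names` is a hypothetical helper, not part of scikit-learn, and the training values are the ones viewers posted in this thread.

```python
from sklearn import tree

features = [[140, 1], [130, 1], [150, 0], [170, 0]]  # [grams, 0 = bumpy / 1 = smooth]
labels = [0, 0, 1, 1]                                # 0 = apple, 1 = orange

# Hypothetical lookup table from numeric label to fruit name.
label_names = {0: "apple", 1: "orange"}

clf = tree.DecisionTreeClassifier().fit(features, labels)

# predict returns an array with one entry per input row; take the first.
prediction = clf.predict([[150, 0]])[0]
fruit = label_names[int(prediction)]
print(fruit)  # orange
```

Another option is to train on string labels directly, in which case predict returns the strings themselves.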

  • Reply Jay Kadam January 22, 2019 at 6:37 pm

    Thats amazing, Thanks!

  • Reply king kong January 23, 2019 at 3:43 am

    up

  • Reply Arwa Kalavadwala January 24, 2019 at 9:20 am

    so with this code can i make a tiny robot project which can differentiate between fruits?

  • Reply Solve Everything January 24, 2019 at 10:15 pm

    I'm a mathematician and proper ML is too difficult . No way I'm going to learn the inner nature of the perceptor, neural layers and the stochastic models for the approximators and classifiers. Good luck learning the core of the inference behind GANs and convolutional models without years of intense study.

  • Reply Dwi Fajar Dandy Saputra January 26, 2019 at 4:37 am

    Thank you Google

  • Reply MygenteTV January 27, 2019 at 1:45 pm

    how to use this with c++ instead of python?

  • Reply Seba Contreras January 28, 2019 at 12:41 am

    1:34 Cristiano Ronaldo do not agree with that

  • Reply Maximilian Karelshteyn January 30, 2019 at 4:35 pm

    It would be fun to program an AI which can play LoL and learn from every game from scratch.I dont know anything about self learning AIs my knowledge is like how to program on java a tree, a house and how to get them on another position or change their colour and i know how to start a database that's , literally , it.So my question is : Is it even realistic or is it something i need to go to a university for?

  • Reply Aakash Kumar February 1, 2019 at 1:55 pm

    What editor is used to make this video : )

  • Reply David Lloyd-Jones February 3, 2019 at 6:32 pm

    Brilliant way to number your videos — with Roman numbers (of an irregular number of digits) tacked on at the right-hand end of the title.

  • Reply sanjiv 070 February 6, 2019 at 4:36 pm

    i m only getting y as the output

  • Reply Nguyen Thien Lam February 14, 2019 at 8:53 am

    why do you need the double [] in the print command?

  • Reply Andrew Lemley February 15, 2019 at 4:33 am

    Hey you might want to show what to do when you download anaconda because when I did it didn’t show up nor did anything else work

  • Reply Venkatesh R February 16, 2019 at 4:52 pm

    I was thinking of Machine learning and boom here it is! What kind of sorcery is this Google?

  • Reply Rabeeh T A February 20, 2019 at 4:21 am

    is there any ML libraries for JavaScript, i know that a little bit than python.

  • Reply Tirth Patel February 21, 2019 at 2:45 pm

    Why do we pass 2D array in this line instead of 1D array? – clf.predict([[150, 0]])
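
A short sketch of why the extra brackets are there: scikit-learn's predict expects one row per sample and one column per feature, so even a single fruit is passed as a list containing one row. The training values below are the ones posted in this thread, and the `batch` and `rejected` names are illustrative only.

```python
from sklearn import tree

features = [[140, 1], [130, 1], [150, 0], [170, 0]]  # [grams, 0 = bumpy / 1 = smooth]
labels = [0, 0, 1, 1]                                # 0 = apple, 1 = orange
clf = tree.DecisionTreeClassifier().fit(features, labels)

# The outer list holds samples, the inner lists hold features,
# so several fruits can be classified in one call.
batch = [[130, 1], [170, 0]]
predictions = clf.predict(batch)
print(predictions)  # [0 1]

# A flat list is ambiguous (one sample with two features, or two samples
# with one feature each?), so recent scikit-learn versions reject it.
rejected = False
try:
    clf.predict([150, 0])
except ValueError:
    rejected = True
print(rejected)
```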

  • Reply Lucas Lima February 23, 2019 at 4:46 pm

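    # Python version: 3.7.2

    from sklearn import tree

    # [weight in grams, texture]: 0 = bumpy, 1 = smooth
    features = [[140, 1], [130, 1], [150, 0], [170, 0]]

    # Light, smooth fruit are apples; heavy, bumpy fruit are oranges
    labels = ['Apple', 'Apple', 'Orange', 'Orange']

    clf = tree.DecisionTreeClassifier()

    clf = clf.fit(features, labels)

    # print is a function in Python 3
    print(clf.predict([[145, 1]]))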

  • Reply ScratchPatch February 23, 2019 at 6:22 pm

    6:40 – The text got blurred, the ai doesnt want people to learn to code them lol

  • Reply Allan Cheah February 24, 2019 at 1:14 am

    Hey this is awesome! =)

    Oh by the way, is it ok if I can also seek your advice in this open source android app I have posted below? Just need some feedback about it…

    Kindly search ' pub:Path Ahead ' in Play Store (P & A are case sensitive).

    give_thanks you !

  • Reply Dimitar Tsvetkov February 24, 2019 at 10:21 pm

    print clf.predict([[150, 0 ]])
    ^
    SyntaxError: invalid syntax
    That's what it says 🙁

  • Reply Pushkaraj Sadegaonkar February 25, 2019 at 3:16 pm

    Really nice video about introduction to ML programming.

  • Reply lisichka ggg February 27, 2019 at 7:11 am

    In the video he told that : "The more training data you have – the better", however what about overfitting?

  • Reply livingthedream March 3, 2019 at 3:08 pm

    -Don't think; let the machine do it for you!
    -Thanks "They Live-movie"-Cow.

  • Reply Richard Benoit March 4, 2019 at 12:55 pm

    Thanks for the post.

  • Reply Ryan K. March 6, 2019 at 7:26 am

    Problem: every time I type import sklearn or from sklearn import tree, it gives me an unresolved error. Please help.

  • Reply Kirill Bezzubkine March 6, 2019 at 8:42 am

    Awesome sip of ml

  • Reply Sime Arsov March 7, 2019 at 10:22 pm

    Wow, google developers that use a mac book. How do I take your video seriously after this?

  • Reply Jaydan Doano March 15, 2019 at 9:32 pm

    i cant download conda it comes out with a pkg anything help please

  • Reply Navneet Kumar March 16, 2019 at 10:51 am

    Nice Presentation for a Beginner like me. Good Lecture

  • Reply Jorge hernandez March 20, 2019 at 3:05 am

    Is it possible to do it
    using Java instead of Python ?

  • Reply Krishnadas PC March 21, 2019 at 4:09 am

    Great introduction.👍

  • Reply Akin Pounds March 26, 2019 at 12:45 am

    what ide is this?

  • Reply Defcon1Gaming March 26, 2019 at 3:20 am

    Fire whoever used a red apple in the green and orange pixel example.

  • Reply Buddhika P. De Silva March 31, 2019 at 2:50 pm

    The code in the video didn't work for me. It shows a message like this:
    AttributeError: module 'sklearn.tree' has no attribute 'DecisionTreeClassifier'

    After that I made some changes and it works!
    from sklearn.tree import DecisionTreeClassifier
    clf = DecisionTreeClassifier()

    # weight in grams
    # 0 – bumpy, 1 – smooth
    features = [[140, 1], [130, 1], [150, 0], [170, 0]]
    labels = [0, 0, 1, 1]
    # 1 – orange, 0 – apple

    clf = clf.fit(features, labels)
    print(clf.predict([[130, 1]]))

  • Reply Krishiv Agarwal March 31, 2019 at 3:23 pm

    No, the easiest way to download any python library is to use Pycharm.

  • Reply SkvProgrammer April 1, 2019 at 7:30 am

    very useful content

  • Reply Rajeshwar S April 2, 2019 at 9:47 am

    But let me know what decision it makes if the data input is out of training data, <100,bumpy what decision it takes? thats why we study machine learning else i could have satisfied with c program itself atleast DOS

  • Reply Samir Maliqi April 3, 2019 at 5:47 pm

    WTF to do with that pkg FILE???

  • Reply MACHINE_BUILDER April 7, 2019 at 6:02 am

    Awesome! Just wrote a bot for 2048 which learns from me (And I suck lol) using the sklearn toolkit to predict the best move 🙂

  • Reply Lamar Medina April 8, 2019 at 12:16 pm

    thanks!

  • Reply Nijimura San April 11, 2019 at 6:08 am

    Please Bring java too

  • Reply G V V Karthikeya April 15, 2019 at 7:31 am

    Its showing a syntax error at line number 6 in your program

  • Reply Anirban Maitra April 21, 2019 at 8:14 am

    Can somebody tell me the best source to learn machine learning….Please provide its link too i would be greatful
    Thanku

  • Reply G. Visal April 22, 2019 at 4:43 pm

    thanks for your videos 🙏

  • Reply Miguel Ramirez April 25, 2019 at 9:36 pm

    Awesome little tidbit! Had to run it in Anaconda3.

  • Reply alkerbix April 26, 2019 at 8:15 pm

    Awesome

  • Reply Donald Faulknor April 30, 2019 at 2:23 am

    Doesn't make sense. Nothing was stored in memory or a database. So how does the "machine" retain information in order to learn from the previous guesses?
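
A sketch that may help with this question: the classifier in the video does not learn from its own guesses. It learns once, from the training data passed to fit, and the resulting rules are held in memory inside the fitted object for as long as the program runs. The example below prints those rules with `tree.export_text` and, as one assumed way to keep them between runs, serializes the object with Python's `pickle` module. The feature names "weight" and "texture" are chosen here for illustration.

```python
import pickle

from sklearn import tree

features = [[140, 1], [130, 1], [150, 0], [170, 0]]  # [grams, 0 = bumpy / 1 = smooth]
labels = [0, 0, 1, 1]                                # 0 = apple, 1 = orange
clf = tree.DecisionTreeClassifier().fit(features, labels)

# The learned rules live inside the fitted object.
print(tree.export_text(clf, feature_names=["weight", "texture"]))

# To reuse them later without retraining, the object can be serialized.
saved = pickle.dumps(clf)
restored = pickle.loads(saved)
print(restored.predict([[150, 0]]))  # [1]
```

With only four examples the tree needs a single rule, so which of the two features it splits on can differ from run to run.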

  • Reply Antopia HK May 2, 2019 at 3:05 am

    would you recommend making a graph like such when making machine learning?

  • Reply Guz Man May 9, 2019 at 6:45 pm

    Python 2.7 ….

  • Reply U live u learn And regret May 14, 2019 at 2:07 am

    if it says python 3.7 (32-bit) on windows is that the same as hello-world.py on mac?

  • Reply Obi-Wan Kenobi May 15, 2019 at 9:13 am

    Sweet, I’ve always wanted to get into machine learning

  • Reply Antonio Williams May 18, 2019 at 9:54 pm

    Overfitting, high variance

  • Reply Adrian Snipes May 25, 2019 at 8:17 am

    Great video but just two things i wanna point out:
    1. While technically not wrong, when you labeled bumpy as 0 but orange as 1, then smooth as 1 but apple as 0. Idk just since you want them to correlate makes more sense to have them 0:0 1:1
    2. The example you wanted it yo predict was already one of the small sample size so it didn't really show its capabilities as well as it could've.
    Nice vid regardless though.
    edit few missed words :p

  • Reply Jacky Wong June 2, 2019 at 10:10 am

    This is the tutorial for me

  • Reply Jerry Liang June 4, 2019 at 2:46 pm

    Excuse me! For the training statement, "clf = clf.fit(features, labels)", is the assignment required? I see in the later recipes, the training statement is simply "clf.fit(…, …)" without assignment. Could you please help? Thanks!
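
A quick check suggests the assignment is optional: in scikit-learn, fit trains the object in place and returns that same object, so both spellings leave `clf` trained. The sketch below uses the training values posted in this thread; `returned` is just an illustrative name.

```python
from sklearn import tree

features = [[140, 1], [130, 1], [150, 0], [170, 0]]  # [grams, 0 = bumpy / 1 = smooth]
labels = [0, 0, 1, 1]                                # 0 = apple, 1 = orange

clf = tree.DecisionTreeClassifier()

# fit modifies clf in place and hands back the very same object.
returned = clf.fit(features, labels)
print(returned is clf)  # True

# So clf is already trained, with or without the assignment.
print(clf.predict([[150, 0]]))  # [1]
```

Returning the object is what allows the chained form `tree.DecisionTreeClassifier().fit(features, labels)`.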

  • Reply Shanilka Ariyarathne June 5, 2019 at 12:39 pm

    can someone tell me , what is the IDE he is using there !!!

  • Reply pjossy joshi June 11, 2019 at 6:47 pm

    What if we have apple orange and banana.
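
A decision tree is not limited to two labels, so a third fruit can be added as extra rows with a new label value. The sketch below is illustrative only: the banana measurements and the label 2 are made-up assumptions, not data from the video, and `label_names` is a hypothetical helper.

```python
from sklearn import tree

# [weight in grams, texture]: 0 = bumpy, 1 = smooth.
# The last two rows are invented banana measurements for illustration.
features = [[140, 1], [130, 1], [150, 0], [170, 0], [115, 1], [120, 1]]
# 0 = apple, 1 = orange, 2 = banana (hypothetical third class).
labels = [0, 0, 1, 1, 2, 2]
label_names = {0: "apple", 1: "orange", 2: "banana"}

clf = tree.DecisionTreeClassifier().fit(features, labels)

# The classifier now knows about three classes.
print(clf.classes_)  # [0 1 2]

# A light, smooth fruit falls on the banana side of the learned rules.
prediction = int(clf.predict([[118, 1]])[0])
print(label_names[prediction])  # banana
```

With real fruit, weight and texture alone would probably not separate apples from bananas well, so a further feature such as shape or color would be a sensible addition.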

  • Reply CUNEYT TASLI June 12, 2019 at 1:40 am

    Great video. Looking forward to watching the rest.

  • Reply Fernando Lovera June 16, 2019 at 3:35 am

    Hello little dude, ML is not a recipe. Stop confusing people, this is a disgrace to the field.

  • Reply Deniz Boz June 16, 2019 at 4:26 pm

    This guy's great.

  • Reply Ahmet GÜRBÜZ June 17, 2019 at 10:44 am

    what is the difference between data mining and machine learning?

  • Reply ridhwaans June 19, 2019 at 9:57 pm

    my ML guru

  • Reply Akhil Y June 24, 2019 at 7:32 am

    What is the point of using Anaconda? Can someone please help me out

  • Reply vikas June 27, 2019 at 5:17 pm

    Very very informative video. Big 🙏🙇 to you bro

  • Reply Rizwan Rauf June 30, 2019 at 12:04 am

    very well explained.

  • Reply Charles - July 1, 2019 at 3:19 pm

    how do I write brackets on a french qwerty keyboard?

  • Reply Kenton Banyai July 2, 2019 at 8:26 pm

    Well thats not fair, you can't compare apples and oranges

  • Reply Nikkolos The Kidd July 3, 2019 at 6:34 pm

    Does miniconda work too?

  • Reply rohith kattamuri July 4, 2019 at 5:38 pm

    Build your first music recommendation system model. Feel free to fork and star this project: github.com/rkat

  • Reply ᎯᏌᎿᏫᎦᏂᏫᎿᏃ July 6, 2019 at 4:00 pm

    Well done sir. Thanks for the help. I really appreciate it and considering I'm understanding this at a young age (12) tells me that other people should understand it.

  • Reply Tech Soft July 17, 2019 at 3:33 am

    visit website c# and @t @

  • Reply Abel Arredondo July 18, 2019 at 1:39 am

    Siraj Raval has this if your interested in learning more

  • Reply Kayumuzzaman Robin July 20, 2019 at 6:13 pm

    too good! <3 i'm loving it!

  • Reply Marco Scale July 25, 2019 at 9:21 am

    don't recognise "import sklearn"
    I have installed Anaconda

  • Reply Michelle Barraclough July 25, 2019 at 2:59 pm

    cant import sklearn on windows i have everything installed why?

  • Reply LEARN! SHARE! and GROW! July 26, 2019 at 4:02 am

    Thanks josh sir!

  • Reply Austin Ma July 30, 2019 at 11:20 am

    I feel like you skipped the billion steps it takes to open your magical python file. I'm on a windows, but how did you go from Anaconda to a normal python file?

  • Reply thundertwinPlaysMC August 1, 2019 at 4:30 pm

    help, i keep getting this error

    Traceback (most recent call last):
    File "C:UserskaiserDesktopmlscript.py", line 1, in <module>
    from sklearn import tree
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\sklearn\__init__.py", line 76, in <module>
    from .base import clone
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\sklearn\base.py", line 13, in <module>
    import numpy as np
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\numpy\__init__.py", line 140, in <module>
    from . import _distributor_init
    File "C:\Users\kaiser\Anaconda3\lib\site-packages\numpy\_distributor_init.py", line 34, in <module>
    from . import _mklinit
    ImportError: DLL load failed: The specified module could not be found.

    >>>

    this is my code

    from sklearn import tree

    # fit and predict expect 2D input: one row per sample, one column per feature
    info = [[31], [21], [40], [71], [60], [80]]
    labels = [1, 1, 1, 0, 0, 0]
    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(info, labels)
    print(clf.predict([[27]]))

  • Reply Funny August 3, 2019 at 6:42 am

    Best video ever 🔥

  • Reply Ollie White August 3, 2019 at 9:34 pm

    What application are you using to code in here?

  • Reply SNKRhead Games August 7, 2019 at 11:36 pm

    What is the System.out.println(“hello world”) of machine learning?

  • Reply JamBear August 11, 2019 at 4:16 pm

    This is so handy, thank you!

  • Reply Linjo 100 August 11, 2019 at 4:45 pm

    is he also a robot?

  • Reply Samuel Davidson August 11, 2019 at 11:38 pm

    I was trying to follow this project on my Raspberry Pi 2 Raspbian OS but couldn't even install scikit-learn. Any suggestions? I tried a few forums but couldn't find a solution that worked.

  • Reply pablo marcel August 13, 2019 at 4:49 am

    good info!

  • Reply AcromaticGaming - Minecraft & More August 13, 2019 at 7:55 am

    6
    Line
    Of Code
    And 6
    Minute
    Of Video

    Did you see my comment was 7-1 lines a second ago????

  • Reply Alpha Garrett August 14, 2019 at 8:59 pm

    Outstanding communicative clarity! How rare.

  • Reply تليفزيون اليوتيوب August 17, 2019 at 12:56 am

    Thank you really this video show me alot

  • Reply Anusha K August 21, 2019 at 7:00 pm

    Does Transfer learning comes in Machine Learning or Deep learning? Neural networks come in Deep learning and transfer learning uses
    Neural networks..ugh I'm confused 🙆🙆🙆 please help..

  • Reply Kvarks August 27, 2019 at 10:16 pm

    What should I study to understand this?

  • Reply Mario G. August 29, 2019 at 12:03 pm

    Code this up and you can play "What's my fruit?" on your computer.
