According to Stack Overflow's Developer Survey, Python was the 4th most popular language among developers in 2022.
Python is common in many domains:
Python is a very popular language (probably the most popular language) for data science, for a few reasons:
A language-agnostic integrated development environment (IDE)
The most popular IDE right now, according to surveys
Supports notebook files, a common format for data scientists
Notebook files (with the .ipynb extension) are designed to work with an application called Jupyter.
However, VSCode can be used instead, via a Jupyter extension supported by Microsoft itself.
You'll need to have a Python environment set up for it to run the code in.
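If you're not sure which Python environment your notebook ended up using, one quick check is to run something like the following in a code cell. This is a minimal sketch that relies only on the standard library; the path and version it prints will differ from machine to machine.

```python
import sys

# Path of the Python interpreter that is running this code.
interpreter_path = sys.executable

# Version of that interpreter as a "major.minor.micro" string.
python_version = ".".join(str(part) for part in sys.version_info[:3])

print("Interpreter:", interpreter_path)
print("Python version:", python_version)
```

If the interpreter path isn't the environment you expected, you can pick a different one with the kernel selector in VSCode.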
VSCode shows a file browser on the left pane (you may have to click the document icon).
If you open the folder containing the course files, you'll be able to see how it's organized: we have folders such as scripts, notebooks, and .github.
notebooks -- that's where the content of the training lives.
These slides, for example, were created from the notebooks/01-Python-and-Notebooks.ipynb file.
By interleaving code and commentary about the code, notebooks provide excellent documentation for data science tasks.
.ipynb (interactive Python notebook)
You can create new notebooks by making a new file with a .ipynb extension.
When you open it, VSCode is smart enough to infer that it's a notebook based on that suffix.
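Under the hood, a .ipynb file is plain JSON text that lists the notebook's cells. The sketch below writes a tiny notebook by hand to illustrate that, assuming the version 4 notebook format. The file name scratch.ipynb and the cell contents are made up for this example, and the file goes into a temporary folder so nothing in the course files is touched.

```python
import json
import tempfile
from pathlib import Path

# A minimal notebook (format version 4) with one Markdown cell and one code cell.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My first notebook"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(1 + 2)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# "scratch.ipynb" is a hypothetical name; the .ipynb suffix is what matters.
notebook_path = Path(tempfile.mkdtemp()) / "scratch.ipynb"
notebook_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")

# Read the file back and list the type of each cell.
loaded = json.loads(notebook_path.read_text(encoding="utf-8"))
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(notebook_path.name, cell_types)
```

In practice you'd rarely write this JSON yourself; creating an empty file with the .ipynb extension and editing it in VSCode is simpler.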
Notebooks are organized by cells. These cells are at the core of a notebook:
You can see what type of cell you're currently editing by looking in the bottom right corner, which will say "Markdown" or "Python".
Code cells are meant for -- you guessed it -- running Python code. To do so:
If the last line of a code cell is an expression, its result will automatically be printed below the cell.
a = 3
b = 4
a * b
12
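Only the last expression gets this treatment. As a small illustrative sketch (the variable name product is ours, not from the course files), the cell below shows what a notebook does and doesn't display:

```python
a = 3
b = 4

# A bare expression in the middle of a cell is evaluated but not displayed.
a + b

# print() always writes its output, wherever it appears in the cell.
print(a - b)

# An assignment is a statement, so it displays nothing by itself.
product = a * b

# The last line is an expression, so a notebook displays its value below the cell.
product
```

Run as a notebook cell, this shows the printed -1 followed by the displayed value 12; the result of a + b never appears.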
There are a few kinds of non-code cells, but the most common is Markdown.
Markdown is a simple language for creating styled text using only plain text
These cells are meant for providing supporting commentary around the code cells
They also support:
Non-code cells are "rendered" when you hit CTRL + RETURN or SHIFT + RETURN
When not editing text:
a -> create a new cell before the one that's selected
b -> create a new cell after the one that's selected
m -> switch a code cell into a markdown cell
y -> switch a markdown cell into a code cell (think Python)
Type print(1 + 2) into a new code cell and run it.
Are there any questions before moving on?