Get started with data - load data into your IDE

Finding the data is the hardest part, prove me wrong #datascience #datasciencetok #python #swe #datavisualization #dataanalytics #codinglife #vscode #ide #rstudio #positron #pycharm #jupyternotebook

Transcript

This transcript was generated automatically and may contain errors.

This is how you can get started working with your data inside of your code environment.

Step 1: pick your dataset

Step 1 is to pick your dataset. You can grab data from a ton of different sources. A few common ones are local files, such as any CSV you've downloaded; API requests, which pull live data from the web; and built-in datasets from libraries like scikit-learn, seaborn, or Hugging Face.
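Here is a rough sketch of what those three kinds of sources can look like in Python. It assumes pandas is installed; the function names, the sample values, and the idea that the API returns a JSON list of records are all made up for illustration. The scikit-learn import sits inside its function so the rest runs even without that library.

```python
import json
from io import StringIO
from urllib.request import urlopen

import pandas as pd


def load_local_csv(path_or_buffer):
    """Read a CSV from a local path (or any file-like object)."""
    return pd.read_csv(path_or_buffer)


def load_from_api(url):
    """Fetch JSON from a web API.

    Assumption: the endpoint returns a JSON list of flat records,
    e.g. [{"city": "Oslo", "temp_c": 4.5}, ...].
    """
    with urlopen(url) as response:
        records = json.load(response)
    return pd.DataFrame(records)


def load_builtin_iris():
    """Load the iris dataset that ships with scikit-learn as a DataFrame."""
    from sklearn.datasets import load_iris  # imported lazily; optional dependency

    return load_iris(as_frame=True).frame


# Stand-in for a downloaded CSV file, with made-up values,
# so the sketch runs without any file on disk.
sample_csv = StringIO("city,temp_c\nOslo,4.5\nLima,19.0\n")
df = load_local_csv(sample_csv)
print(df)
```

With a real file you would pass its path instead, for example load_local_csv("my_data.csv"), where my_data.csv is a placeholder name.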

Step 2: create a new script in your IDE

Step number 2 is to create a new script in your IDE. I always use Positron. Positron is built for data apps, so it is great for this sort of workflow. Open the app, click File, then New File, and choose a new Python script. Then, if you're working locally, save the file in the same folder as your dataset.

Step 3: load the data

Then step 3 is to actually load the data. Once you've got it loaded into a DataFrame, you can use some common commands like df.head(), df.info(), or df.describe() to preview the first rows, check the column types, and see some summary statistics.
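As a minimal sketch of step 3, assuming pandas and a small made-up table standing in for your real file, loading and inspecting might look like this. The column names and values are invented for the example.

```python
from io import StringIO

import pandas as pd

# Made-up data standing in for a CSV saved next to your script.
csv_text = """product,units,price
widget,10,2.50
gadget,4,10.00
gizmo,7,4.00
"""

# With a real file this would be pd.read_csv("your_file.csv").
df = pd.read_csv(StringIO(csv_text))

# First rows of the table (head() shows up to five by default).
print(df.head())

# Column names, dtypes, and non-null counts; info() prints directly.
df.info()

# Count, mean, std, min, quartiles, and max for the numeric columns.
summary = df.describe()
print(summary)
```

Note that describe() only covers numeric columns by default, so the text column product is left out of the summary here.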

Whatever project you plan to work on, whether that's model training, dashboards, or just playing with data, the first step is always getting the data into your environment, and Positron makes that such a smooth experience.

Make sure you leave any questions in the comments below and follow along for more data science content.

Featured software