Google’s data search engine is useful for finding datasets in a particular niche. This is a great starting point for both paid and free datasets from top sources around the web. Other useful Google sources are Google Trends and Google’s Public Data Directory.

[…] government’s free and open datasets here. This is a rich source for public economic data (like housing, wages, and inflation) as well as education, health, agriculture, and census data. With more than 300,000 datasets available, this repository is extremely helpful.

FiveThirtyEight
FiveThirtyEight might be best known for its data journalism. Fortunately, the site also makes most of the data it uses in its reporting open to the public. This is a great source for a wide range of data with a focus on politics, sports, and culture.

Kaggle is one of the most popular communities for data scientists, and the site’s user-published datasets are great for self-guided machine learning or analysis projects. You’ll find a wide range of data, from movie reviews to customer sales data, and, fortunately, most have some of the preprocessing done. This is a useful source of data for sentiment analysis projects as well as data analysis and visualization projects.

UCI Machine Learning Repository
Check out the University of California Irvine’s repository, which features nearly 500 public datasets. This is a great source for clean, ready-to-model data in a wide range of niches, from a dataset of chickenpox cases to bank marketing data.

This regularly updated library of datasets is a great place to start. The data is organized by category, with options like machine learning and software, and you’ll find quick links to sources.

Amazon Web Services Open Data Registry
Amazon’s registry provides public access to data from a range of organizations, from the 1000 Genomes Project to NASA. You’ll also find helpful usage examples for many of the datasets, as well as project links for various organizations and groups.

The Pew Research Center’s data repository focuses mainly on culture and media. In particular, you’ll find datasets and surveys covering media consumption, social media use, and demographic trends, such as its 2018 Twitter survey.

data.world
data.world calls itself a “collaborative data community,” and the site has built a dedicated audience of data scientists who have collaborated on projects like social bot detection and data journalism. You’ll find datasets in a range of categories, from crime to Twitter.

There’s a plethora of regularly updated public COVID data available online. Some of the best sources include CDC COVID Data Tracker and Our World In Data. For more niche projects, try the Coronavirus Tweets Database, featuring more than 1 billion Tweets, as well as The Marshall Project’s COVID cases in prisons datasets.

If you’re looking for healthcare data, start with the WHO’s Global Health Observatory repository. The platform features a variety of health-related statistics, such as HIV/AIDS, vaccination rates, and malaria. If you want to build a machine learning health project, this is the source to use.

Academic Torrents
Academic Torrents is a database for large-scale datasets for research projects. The data is shared by researchers, and there’s a variety of interesting sources, including the classic Enron email dataset and the annotated New York Times text corpus, which contains 1.8 million articles.

Nasdaq Data Link
For FinTech machine learning projects, you’ll find a variety of finance-related datasets on Nasdaq Data Link. The site features both paid and free data. Some free datasets of note include Zillow Real Estate Data and Federal Reserve Economic Data. You’ll need to create an account to access the site’s 20+ free sources. However, there are numerous premium datasets available as well. This is a great data source for a real estate data science project.

Data Is Plural
Each week, Jeremy Singer-Vine compiles a newsletter of useful and curious datasets. Since 2015, he’s published more than 300 newsletters, and you can access the full archive on his site. The latest edition featured a dataset of restaurant “chainness” as well as an energy-demand dataset. If you’re interested in data, you should subscribe to the weekly newsletter.

NASA’s Earth Science Data Systems Program is a repository of the organization’s Earth science data. You’ll find datasets related to sea level rise, wildfire frequency, and tropical storms, among other interesting Earth science insights.