Check out the Natural Language category for a list of text corpora and n-grams for text analysis.
COVID-19 Open Research Dataset (CORD-19)
A free resource of over 47,000 scholarly articles, including over 36,000 with full text, about COVID-19 and the coronavirus family of viruses.
Catalog of hundreds of thousands of public data sets created at the city, state, and federal levels.
Inter-University Consortium for Political and Social Research (ICPSR)
ICPSR receives, processes, and distributes data on social phenomena in countries across the world. ICPSR maintains a data archive on topics in the social and behavioral sciences, including specialized collections in education, aging, criminal justice, substance abuse, terrorism, and other fields. The archive includes survey data, census records, election returns, economic data, and legislative records.
Programmable Web API Directory
Search over 15,000 APIs, or browse by categories.
Browse and search thousands of disciplinary, generalist, and institutional data repositories that include textual data.
A subreddit for sharing and discussing datasets.