Are you looking to build, or improve, a programmatic SEO site in the books niche?
If so, today is your lucky day! We've put together the 10 best books datasets for programmatic SEO (most of them free) that you can download and use in your own projects.
Let's jump in.
10 Useful Books Datasets for pSEO
Along with a brief description of each dataset, we have also included the format(s) it is available in.
1. Goodreads Books
Available format(s): CSV
A dataset containing a comprehensive list of books on Goodreads, with features such as book title, author, publication date, rating, and number of ratings. It has 10,000+ records.
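For a sense of how a CSV like this could feed a pSEO site, here is a minimal sketch that turns each row into the fields a book page template might need. The column names (`title`, `author`, `average_rating`, `ratings_count`) and the sample rows are made-up placeholders, so check the real header of the file you download before using it.

```python
import csv
import io
import re

# Made-up sample rows; the column names are assumptions, not the real header.
SAMPLE_CSV = """title,author,average_rating,ratings_count
Example Book One,Jane Placeholder,4.12,1500
Another Example: A Novel,John Sample,3.87,320
"""

def slugify(text):
    """Lowercase the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def build_pages(csv_text):
    """Map each CSV row to the fields a page template would need."""
    pages = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        heading = f"{row['title']} by {row['author']}"
        pages.append({
            "slug": slugify(heading),
            "heading": heading,
            "rating": float(row["average_rating"]),
            "ratings_count": int(row["ratings_count"]),
        })
    return pages

pages = build_pages(SAMPLE_CSV)
for page in pages:
    print(page["slug"], "->", page["heading"])
```

With a real file you would likely swap `io.StringIO(csv_text)` for `open(path, newline="", encoding="utf-8")` and pass each record to your page generator of choice.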
2. Book Cover
Available format(s): CSV
A dataset of 207,572 books from the Amazon marketplace, containing the cover image, title, author, and category for each book. It is split into two tasks: first, a classification task of classifying books by cover image, with a 90%/10% training/test split; and second, a data mining task of exploring the entire book database in 32 classes.
3. Books API
Available format(s): JSON
The Books API provides information about book reviews and The New York Times Best Sellers lists, including best-seller list names, list data, and book reviews searchable by author, ISBN, and title.
4. Amazon Top 50 Bestselling Books 2009 - 2019
Available format(s): CSV
A dataset containing a list of 550 books that were top 50 bestsellers on Amazon from 2009 to 2019. It includes each book's name, author, user rating, number of reviews, price, year of release, and genre.
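Because this dataset has both a year and a genre for every book, it lends itself to list pages such as "Best Fiction Books of 2018". Below is a rough sketch of that grouping. The header names (`Name`, `User Rating`, `Year`, `Genre`, and so on) and the rows are placeholders we made up for illustration, not values from the actual file.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows; the header names below are assumptions.
SAMPLE_CSV = """Name,Author,User Rating,Reviews,Price,Year,Genre
Placeholder Title A,Author A,4.7,1200,8,2018,Fiction
Placeholder Title B,Author B,4.5,800,12,2018,Non Fiction
Placeholder Title C,Author C,4.8,2500,10,2018,Fiction
"""

def group_by_year_and_genre(csv_text):
    """Bucket rows by (year, genre) and sort each bucket by rating, best first."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[(int(row["Year"]), row["Genre"])].append(row)
    for books in groups.values():
        books.sort(key=lambda r: float(r["User Rating"]), reverse=True)
    return dict(groups)

groups = group_by_year_and_genre(SAMPLE_CSV)
for (year, genre), books in sorted(groups.items()):
    titles = ", ".join(book["Name"] for book in books)
    print(f"Best {genre} Books of {year}: {titles}")
```

Each `(year, genre)` key maps to one list page, and the sorted rows give you the order of the books on that page.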
5. Subset of the books available in Amazon
Available format(s): CSV
This dataset covers a subset of the books available on Amazon, along with user ratings. It has three tables: one for users, one for books, and one for ratings. Explicit ratings are on a scale of 1-10, and implicit ratings are recorded as 0. Data points include book, publisher, year of publication, author, etc.
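Since implicit ratings are stored as 0, they would drag down any average if you left them in. Here is a small sketch of joining the books and ratings tables while skipping the implicit rows. The column names (`isbn`, `title`, `rating`, etc.) and the rows are hypothetical stand-ins, and the users table is left out because this example doesn't need it.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder tables; column names are assumptions.
BOOKS_CSV = """isbn,title,author,publisher,year
111,Placeholder Book A,Author A,Publisher X,2001
222,Placeholder Book B,Author B,Publisher Y,1999
"""
RATINGS_CSV = """user_id,isbn,rating
1,111,8
2,111,0
3,111,6
1,222,0
"""

def average_explicit_ratings(books_text, ratings_text):
    """Join books to ratings on isbn, ignoring implicit (0) ratings."""
    totals = defaultdict(lambda: [0, 0])  # isbn -> [sum, count]
    for row in csv.DictReader(io.StringIO(ratings_text)):
        rating = int(row["rating"])
        if rating == 0:  # implicit rating, not a real score
            continue
        totals[row["isbn"]][0] += rating
        totals[row["isbn"]][1] += 1

    summary = []
    for book in csv.DictReader(io.StringIO(books_text)):
        total, count = totals.get(book["isbn"], (0, 0))
        summary.append({
            "title": book["title"],
            "avg_rating": total / count if count else None,
            "explicit_count": count,
        })
    return summary

summary = average_explicit_ratings(BOOKS_CSV, RATINGS_CSV)
for entry in summary:
    print(entry)
```

Books with no explicit ratings end up with `avg_rating` set to `None`, so your template can hide the rating instead of showing a misleading zero.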
6. HAPI Books
Available format(s): JSON
HAPI Books is an API that provides access to thousands of book records including title, genre, author, year, and other information. It allows users to search and filter books by various parameters and offers endpoints for retrieving best books by year or weekly suggestions.
7. Top 100 Young Adult Fiction
Available format(s): CSV
This dataset lists the top 100 Young Adult Fiction books according to Goodreads members, including details such as rank, title, author, description, genres, rating, and 8 more data points.
8. books
Available format(s): JSON
"Books dataset" provides a search function for books and authors, with options to search by language, title, ISBN, subject, and author name. It includes up-to-date documentation and sample responses.
9. Goodreads Book Datasets With User Rating 2M
Available format(s): CSV
A dataset containing 2M books from Goodreads with user ratings, including information such as book title, rating distribution, number of pages, publisher, and review count.
10. Book Depository
Available format(s): CSV
A large collection of book metadata covering title, description, dimensions, category, and cover image. The columns include authors, bestsellers-rank, categories, edition, edition-statement, for-ages, format, id, illustrations-note, image-checksum, image-path, image-url, imprint, index-date, isbn10, isbn13, lang, publication-date, publication-place, rating-avg, rating-count, title, url, and weight.
And that's it! I hope you found this list of books datasets useful and if you do use any of them, let me know on Twitter.