Introduction
In this project, you will learn how to query a public dataset using Google BigQuery. BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse. By leveraging public datasets, you can explore large data sets without uploading your own data, making it a great way to learn SQL and data analysis in the cloud—free if you stay within BigQuery’s free tier limits.
Why Is It Useful?
Hands-on SQL Skills: You can practice real-world queries with large datasets.
No Data Upload Needed: Public datasets are already available, saving time and cost.
Scalable and Serverless: No infrastructure to manage, so you focus only on your queries.
Prerequisites
Sign up at cloud.google.com if you don’t have one.
BigQuery offers a certain amount of free queries each month; ensure you stay within free limits.
Includes the bq command-line tool for BigQuery.
Install it from Google Cloud SDK documentation.
You need Owner or Editor role in your Google Cloud project to run BigQuery queries.
Typically, the default project you create will allow you to use BigQuery.
Ensure the BigQuery API is enabled in your project.
You can enable it via APIs & Services in the console or using the CLI command shown below.
Step 1: Select or Create a Project
Go to Google Cloud Console.
In the top-left dropdown, select an existing project or create a new one (for example: my-bigquery-project).
Step 2: Enable BigQuery API
In the Console, go to APIs & Services > Library.
Search for BigQuery API, select it, and click Enable if it’s not already enabled.
Step 3: Opening BigQuery
Console : In the left navigation menu, scroll down to Big Data or Analytics section and click BigQuery.
This opens the BigQuery console (also known as the BigQuery web UI).
CLI : The BigQuery CLI tool is called bq. You can verify it’s installed by running: bq version
Step 4: Query the Public Dataset
We’ll run a simple SQL query on the bigquery-public-data.samples.shakespeare table.
Console : In the BigQuery console, under Explorer, expand bigquery-public-data > samples.
Click on shakespeare. You’ll see table details and a Query button.
Click Query (or New Query at top).
Paste this SQL:
SELECT word, word_count
FROM `bigquery-public-data.samples.shakespeare`
WHERE word IN ('love', 'death', 'king')
ORDER BY word_count DESC
LIMIT 10;
Click Run. You’ll see the results in the console below.
CLI :
bq query --use_legacy_sql=false \
'SELECT word, word_count
FROM `bigquery-public-data.samples.shakespeare`
WHERE word IN ("love", "death", "king")
ORDER BY word_count DESC
LIMIT 10;'
Step 5 : Run Your Query
In that query editor, you can paste the SQL, for example:
SELECT word, word_count
FROM `bigquery-public-data.samples.shakespeare`
WHERE word IN ('love', 'death', 'king')
ORDER BY word_count DESC
LIMIT 10;
Then click “Run” at the top of the editor. The results will appear below once the query finishes.
Step 6 : Verifying and Testing the Project
Conclusion
In this guide, we have:
Queried a public BigQuery dataset (shakespeare) without using any extra credits, staying within the free tier.
Learned how to do it via Google Cloud Console (GUI) and the CLI (bq).
Understood where to take screenshots for your documentation.
Explored common troubleshooting steps, from enabling APIs to verifying permissions.
Popular Projects
What is Cloud Computing ?
Cloud computing delivers computing resources (servers, storage, databases, networking, and software) over the internet, allowing businesses to scale and pay only for what they use, eliminating the need for physical infrastructure.