When migrating data from MySQL to BigQuery, one of the common challenges is translating table schemas between the two systems. MySQL and BigQuery use different data types and schema formats, making direct migration complex. Here's a practical SQL query that automatically generates BigQuery-compatible schema JSON from your existing MySQL tables.
The following MySQL query reads the INFORMATION_SCHEMA.COLUMNS
table and generates a JSON schema that can be directly used with BigQuery:
SELECT
CONCAT('[',
GROUP_CONCAT(
JSON_OBJECT(
'name', COLUMN_NAME,
'type', CASE
WHEN DATA_TYPE IN ('varchar', 'char', 'text', 'longtext', 'tinytext', 'mediumtext') THEN 'STRING'
WHEN DATA_TYPE IN ('int', 'tinyint', 'smallint', 'mediumint', 'bigint') THEN 'INTEGER'
WHEN DATA_TYPE IN ('float', 'double', 'decimal') THEN 'FLOAT'
WHEN DATA_TYPE IN ('datetime', 'timestamp') THEN 'TIMESTAMP'
WHEN DATA_TYPE = 'date' THEN 'DATE'
WHEN DATA_TYPE = 'time' THEN 'TIME'
WHEN DATA_TYPE IN ('blob', 'longblob', 'tinyblob', 'mediumblob', 'binary', 'varbinary') THEN 'BYTES'
WHEN DATA_TYPE = 'boolean' OR (DATA_TYPE = 'tinyint' AND COLUMN_TYPE = 'tinyint(1)') THEN 'BOOLEAN'
ELSE 'STRING'
END,
'mode', CASE
WHEN IS_NULLABLE = 'NO' THEN 'REQUIRED'
ELSE 'NULLABLE'
END
)
),
']') AS schema_json
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = '{your_database_name}'
AND TABLE_NAME = '{your_table_name}'
ORDER BY ORDINAL_POSITION;
The query performs several key transformations:
varchar
→ STRING
, int
→ INTEGER
)IS_NULLABLE
field to BigQuery's REQUIRED
/NULLABLE
modesORDINAL_POSITION
Simply replace the placeholders with your actual values:
{your_database_name}
: Your MySQL database name{your_table_name}
: The specific table you want to convertThe output will be a JSON array that you can directly use when creating BigQuery tables or during data import operations.
Warning: MySQL's GROUP_CONCAT
function has a default maximum length limit (typically 1024 characters), which can truncate the output for tables with many columns or long column names.
The GROUP_CONCAT_MAX_LEN
system variable controls the maximum allowed result length. For complex schemas, the generated JSON might exceed this limit, resulting in incomplete output.
Before running the query, increase the limit temporarily:
SET SESSION group_concat_max_len = 1000000;
Alternatively, for a permanent solution, you can modify the MySQL configuration or break down large tables into smaller chunks by filtering specific column ranges.
return;