SQL UNION
The UNION
operator in SQL combines the result sets of two or more SELECT
queries into a single result set. It removes duplicate rows by default, ensuring the output contains only unique rows.
Why Use SQL UNION?
- To combine results from multiple queries into a single result set.
- To merge data from multiple tables or sources.
- To perform queries on data with similar structures.
Syntax
Key Points
- Same Number of Columns: Each
SELECT
statement must return the same number of columns. - Compatible Data Types: The corresponding columns must have similar data types.
- Distinct Rows: By default,
UNION
removes duplicate rows. - To Keep Duplicates: Use
UNION ALL
.
Example Dataset
Customers_US Table
CustomerID | Name | Country |
---|---|---|
1 | John Doe | USA |
2 | Alice Lee | USA |
3 | Bob Smith | USA |
Customers_UK Table
CustomerID | Name | Country |
---|---|---|
4 | Emma Brown | UK |
5 | Harry Potter | UK |
2 | Alice Lee | USA |
Examples
1. Basic UNION
Combine customers from both the US and UK without duplicates:
Result:
CustomerID | Name | Country |
---|---|---|
1 | John Doe | USA |
2 | Alice Lee | USA |
3 | Bob Smith | USA |
4 | Emma Brown | UK |
5 | Harry Potter | UK |
2. UNION ALL
Include duplicates in the result set:
Result:
CustomerID | Name | Country |
---|---|---|
1 | John Doe | USA |
2 | Alice Lee | USA |
3 | Bob Smith | USA |
4 | Emma Brown | UK |
5 | Harry Potter | UK |
2 | Alice Lee | USA |
3. UNION with Filtering
Get customers whose names start with 'A' from both tables:
Result:
CustomerID | Name | Country |
---|---|---|
2 | Alice Lee | USA |
4. UNION with Additional Columns
When combining tables with different columns, ensure you select only matching columns.
Example Tables:
- Orders_US:
OrderID
,CustomerID
,OrderDate
- Orders_UK:
OrderID
,CustomerID
Difference Between UNION and UNION ALL
Feature | UNION | UNION ALL |
---|---|---|
Removes Duplicates | Yes | No |
Performance | Slower (extra work to remove duplicates) | Faster |
Use Case | Unique combined result | When duplicates are allowed |
Equivalent Query Without UNION
Using UNION
is more efficient than writing multiple queries and combining their results manually. However, without UNION
, you'd need separate queries and merge their results programmatically or via temporary tables.
Common Use Cases
Combining Data from Different Tables:
- Example: Combine employees from different departments.
Merging Historical and Current Data:
- Example: Combine archived and active user records.
Cross-Domain Queries:
- Example: Combine data from different databases or schemas.
Performance Considerations
- Indexes: Ensure proper indexing on columns used in the
SELECT
clauses. UNION ALL
for Speed: UseUNION ALL
when duplicates are acceptable, as it avoids the additional work of eliminating duplicates.- Database Size: Be mindful of the result set size when combining large tables.
Conclusion
The SQL UNION
operator is a powerful tool for combining data from multiple sources while maintaining simplicity and efficiency. Its ability to deduplicate results by default ensures clean and meaningful outputs. However, when performance or duplicates are not a concern, UNION ALL
can be a better choice.