I got to interview Buck Woody about Data Science at Microsoft Ignite 2017! 🥳
In this interview, we chat about Microsoft Business Analytics and AI (formerly known as Cortana Intelligence Suite), Artificial Intelligence in Excel, intent-based programming, Predictive Analytics, DevOps for Data Scientists and life-long learning.
Data Science with Buck Woody - Microsoft Ignite 2017
Inserts, updates and deletes on large tables can be very slow and expensive, cause locking and blocking, and even fill up the transaction log. One of the main benefits of table partitioning is that you can speed up loading and archiving of data by using partition switching.
Partition switching moves entire partitions between tables almost instantly. It is extremely fast because it is a metadata-only operation that updates the location of the data, no data is physically moved. New data can be loaded to separate tables and then switched in, old data can be switched out to separate tables and then archived or purged. All data preparation and manipulation can be done in separate tables without affecting the partitioned table.
T-SQL Tuesday #65 is hosted by Mike Donnelly (@SQLMD). There is no specific topic to write about this month, Mike simply wants us to learn something new and then write a blog post to teach it to others. I want to share something that I only recently learned, something I wish I had known about years ago, something that became part of my toolbox as soon as I discovered it: the Numbers Table. It is so simple and solves so many problems that everyone should know about it 🤓
A Numbers Table (perhaps most known as a Tally Table, sometimes called an Auxiliary Table of Numbers and even referred to as the Swiss Army Knife of SQL Server) is a one-column helper table that contains the numbers 1, 2, 3, 4, 5 and so on all the way up to the-highest-number-you-could-possibly-need.
It can be used to replace slower loops and row-by-row operations with faster set-based operations, generate dates, split strings, find gaps in data sets, expand data sets, insert test data and probably hundreds of other things. There are so many great and detailed articles already published about this topic, so I will stick to the T-SQL Tuesday topic of “Teach Something New” and share the two most recent things I learned: different ways to create a numbers table, and different ways to quickly insert test data by using a numbers table.
There are many benefits of partitioning large tables. You can speed up loading and archiving of data, you can perform maintenance operations on individual partitions instead of the whole table, and you may be able to improve query performance. However, implementing table partitioning is not a trivial task and you need a good understanding of how it works to implement and use it correctly.
Being a business intelligence and data warehouse developer, not a DBA, it took me a while to understand table partitioning. I had to read a lot, get plenty of hands-on experience and make some mistakes along the way. (The illustration to the left is my Table Partitioning Cheat Sheet.) This post covers the basics of partitioned tables, partition columns, partition functions and partition schemes.
The table below is a simplified mapping between SQL Server, SSIS and Biml Data Types. It does not include all possible mappings or all data types, but is meant as a quick reference while developing and learning Biml. I mainly created it as a cheat sheet for myself, but I hope other Biml developers will find it useful!
Can't remember this post's URL? Neither can I 😅 You can use this short URL instead: cathrinew.net/BimlDataTypes
Cheat Sheet for SQL Server, SSIS and Biml Data Types
SQL Server
SSIS
Biml
bigint
DT_I8
Int64
binary
DT_BYTES
Binary
bit
DT_BOOL
Boolean
char
DT_STR
AnsiStringFixedLength
date
DT_DBDATE
Date
datetime
DT_DBTIMESTAMP
DateTime
datetime2
DT_DBTIMESTAMP2
DateTime2
datetimeoffset
DT_DBTIMESTAMPOFFSET
DateTimeOffset
decimal
DT_NUMERIC
Decimal
float
DT_R8
Double
geography
DT_IMAGE
Object
geometry
DT_IMAGE
Object
hierarchyid
DT_BYTES
Object
image (*)
DT_IMAGE
Binary
int
DT_I4
Int32
money
DT_CY
Currency
nchar
DT_WSTR
StringFixedLength
ntext (*)
DT_NTEXT
String
numeric
DT_NUMERIC
Decimal
nvarchar
DT_WSTR
String
nvarchar(max)
DT_NTEXT
String
real
DT_R4
Single
rowversion
DT_BYTES
Binary
smalldatetime
DT_DBTIMESTAMP
DateTime
smallint
DT_I2
Int16
smallmoney
DT_CY
Currency
sql_variant
DT_WSTR
Object
text (*)
DT_TEXT
AnsiString
time
DT_DBTIME2
Time
timestamp (*)
DT_BYTES
Binary
tinyint
DT_UI1
Byte
uniqueidentifier
DT_GUID
Guid
varbinary
DT_BYTES
Binary
varbinary(max)
DT_IMAGE
Binary
varchar
DT_STR
AnsiString
varchar(max)
DT_TEXT
AnsiString
xml
DT_NTEXT
Xml
(*These data types will be removed in a future version of SQL Server. Avoid using these data types in new projects, and try to change them in current projects.)