Your database is your greatest business asset. It holds all your customer data. You need to properly work with your database to avoid mistakes. An update query changes records in your tables, but used improperly, you can overwrite records accidentally. If you don’t have a backup, your data is corrupted. Update queries aren’t complex, but it’s best to understand the format before changing multiple records.
A Basic Update Statement
The following code is the very minimum you need to write an update statement in SQL:
update customer set first_name=’Jack’
The above statement sets all records to a first name of “Jack.” Notice that it changes all records. The statement identifies “customer” as the table you want to change and the “first_name” column is what you update. Finally, the ‘Jack’ string is what replaces any current values.
One issue to note in the above statement is that it’s a valid SQL statement, but it probably is not what you want to do. Changing all records is usually disastrous, and you’re forced to recover using a backup. That’s where the “where” clause comes into effect. You need the where clause to filter out the records you don’t want to change. You can update one, hundreds, thousands or even millions of records with only update statement.
Adding the Where Clause to Your Update Statement
The above statement is not really a typical update SQL query. Most update statements add a “where” clause. The following code is an example of an update statement with the where clause:
update customer set first_name= ’Jack’ where customerId = 5
The first statement shown updates all records, but the above statement just changes one record. The “customerId” column is assumed to be a unique column in your table. The customer’s first name is then changed to “Jack.” It doesn’t matter what the first name is set to previously in this statement. The only thing the SQL database cares about with this statement is the customer Id.
You can make spelling corrections with the update statement. For instance, suppose you want to change spelling mistakes from “jck” to “Jack” or you can replace data with proper capitalization. The following update statement changes “jck” to “Jack”:
update customer set first_name= ’Jack’ where first_name= ‘jck’
The above statement only changes records where the first name (first_name column) equals “Jck.” Capitalization does not matter in this statement, so “jck” or “Jck” doesn’t matter. However, the first name column will change to a capitalized version just like it’s typed out in the statement.
The following statement updates an improperly capitalized value to “Jack”:
update customer set first_name= ’Jack’ where first_name= ‘jack’
The above statement makes sure you have properly capitalized first names.
Updating Multiple Columns at Once
The above statements updated just one column, but you can update multiple columns at once. The following code updates the first and last name columns:
update customer set first_name= ’Jack’, last_name = ’Smith’ where customerId = 5
Just like the previous update statement, this where clause filters out all customers except the row with the id of 5. The first and last name columns are edited to Jack and Smith respectively.
Using Multiple Where Clause Filters
You can use multiple where clause filters to specify records you want to update. In the above example, you just update a customer with an id of 5. But sometimes, you don’t know the customer’s id and you just have a few basic business logic rules to follow. Suppose you only know that you have a list of customers that signed up within the last week and you need to update the country id from another table. There are a number of reasons it can happen, but you just know that you need to update some records.
First, you need your “select” query. A select query is what retrieves a list of records. You can use this select query to get the list of customers you need to edit. Second, you also need to retrieve the right country id from your country table.
The first step is to get your country select statement. The following code is an example of a select statement that gets the country code for the zip code ‘33071’:
select countryid from countries where zipcode = ‘33071’
There are several ways you can update your database, but it’s best to update small buckets of rows instead of millions at a time. First, an update statement takes up much of your server’s resources. The locks on the tables can kill performance. This means that if you’re running an update statement on a live database table, you can create errors with your critical components like an ecommerce shopping cart or customer interaction with your site.
Once you become more confident with your queries, you can update more records at a time, but you should run these queries carefully and during off-peak hours.
Next, you create a select query to sample the list of records you’ll update. Using the select statement, you can ensure that your update statement only affects the right records. Using the current scenario, the following select statement gets a list of records that consists of users who signed up within the last week:
select * from users where signup_date >= getDate() – 7 and zip_code = ‘33071’
Of course, this is not your update statement, but this lets you sample data. The where clause is what’s important. The above statement returns a list of users who signed up in the last seven days and also has a zip code that matches the subquery you wrote earlier. You can get more dynamic with this query, but again, you should run update statements in blocks to limit the amount of records to change. This is the clause you use in your update statement to ensure that you update the right records. To turn the above select statement to an update statement, use the following code:
update users set countryid = <subquery> where signup_date >= getDate() – 7 and zip_code = ‘33071’
Your update statement is almost complete now. Notice there is a “<subquery>” placeholder. Right now, the query will return an error, because this is improper syntax. The following query puts the whole update query together:
update users set countryid = (select top 1 countryid from countries where zipcode = ‘33071’) where signup_date >= getDate() – 7 and zip_code = ‘33071’
The above query might be confusing to some people who aren’t used to SQL. The subquery gets a country code from an external country table. The country code matches the zip code “33071.” In this case, the country is the US, so whatever Id is assigned to US is retrieved. The “top 1” shouldn’t be necessary, but it’s a preventative if you have more than one record for the zip code. If you return more than one record in the subquery, SQL will return an error and tell you that more than one record was returned. To prevent this problem, the “top 1” phrase only returns one record. Logically, only one value should return for that specific zip code. If you have more than one country code returned, you have a data integrity problem. If the country code is returned twice, you probably have duplicate records in the database and this can cause other problems later on with other SQL statements.
More Advanced Update Queries
With the above query, you need to run an update for each zip code in your database. If you only have a few zip codes to work on, this code is probably enough and you can just run your query for each zip code separately. However, if you have dozens, it’s not efficient. You can take your query a step further and create an update statement that’s dynamic and lets you update several records properly. Again, you want to run the same select query to ensure that the records you want to update are precise. This time, though, you take out the zip code in the where clause, because you want to update all records. The following samples the data:
select * from users where signup_date >= getDate() – 7
The following is the complete update query:
update users set countryid = (select top 1 countryid from countries where countries.zipcode = users.zipcode) where signup_date >= getDate() – 7
The above query will update all records created with a signup date within the last 7 days. The “getDate() – 7” in the where clause filters out all records except those within the last seven days.
The difficult part to understand is the subquery that actually performs the update. Notice that the where clause in the subquery uses the users and the countries table. The subquery links the two tables together to return the country code for each record in your update statement. What this statement basically says is “look up the user’s zip code in the users table and match that record with the corresponding zip code in the countries table.” The result should only be one record (but again “top 1” is used to avoid the error). The matching country code is then inserted into the users table in the countryid column. The result is that you can update several zip codes at once and run a dynamic query that matches the user’s address with the appropriate country.
You can also use an “in” statement to filter out records. The most important part of any update statement is the where clause, because it determines the records that update in your table. You might want to use an in clause with your where clause. There are a number of reasons you might want to use an in statement. For instance, you might only want to update a list of customers who have an order. You need to get a list of orders in the orders table. The following code uses the above update statement but changes the business logic for what records to update:
update users set countryid = (select top 1 countryid from countries where countries.zipcode = users.zipcode) where customerId in (select customerId from orders)
The above statement changes the same column, but it only changes customers who have an order in the orders table. It still uses the same subquery, so records are updated with the right country code in your countries table. The only part that changes is the business logic in the where clause.
In the above query, all customers with an order are updated. This can be several records, and it will harm performance if you have any other services running off of the database. For example, you could limit the number of customers who placed an order within the last seven days. The following query does just that:
update users set countryid = (select top 1 countryid from countries where countries.zipcode = users.zipcode) where customerId in (select customerId from orders where order_date >= getdate() - 7)
In the above query, the business logic changes just a bit. Until you understand SQL fully, you probably need some explanation when queries are changed slightly. Just a small change can greatly affect the records that update. In this query, instead of retrieving signup dates in the customer table, SQL gets records from the orders table. If the user placed an order within the last 7 days, it returns the record. Then, the “outer” query runs. The outer query updates the users table from the list of users returned by the “in” subquery statement. You could even add a user’s signup date to combine a list of users who signed up within the last 7 days and from those users, only update ones with an order within the last 7 days. The query would look like the following:
update users set countryid = (select top 1 countryid from countries where countries.zipcode = users.zipcode) where signup_date >= getDate() – 7 and customerId in (select customerId from orders where order_date >= getdate() - 7)