Make the DB Optimized
Recently I got a issue with slowness in a data pipeline.Real issue was particular flow id was taking so much time to get it done. But the same time there are some other things are too supposed to happening.It is because of previous task took lot of time.
I had a look on that particular SQL file and search for that particular tables with go with that query. Then I get a sample instance ids and execute to get a idea . didn’t get much idea, so i decided to use explain sql key word.from that usually i am getting how many raws are getting affected and what can i do for make it fast. mmmm… wasn’t very much help full actually. Now what i have decided to go for DDL file and check what’s happening .
I have not index the table , wait i can use that to make it fast .but i just need to know more about this so these are the ways that make you query fast.
while we are updating the tables from the SQL query, an UPDATE statement writes longer to a table than a CASE statement, because of its logging. An inline CASE statement chooses what is preferred before writing it on the table, thus making it a very simple way to speed up your SQL query.
so instead of just telling that I would like to give some examples
SQL query returns the maximum hourly rate for each job title in the HumanResources.Employee table. The HAVING clause restricts the titles to those that are held by men with a maximum pay rate greater than 80 dollars or women with a maximum pay rate greater than 60 dollars.
focus on the two CASE statements with in the HAVING statement , and read what we are gonna get from the above paragraph.Then easy to grab it once.
we could store some rows on one machine and others on another machine. The idea to split the data by row is called horizontal partitioning.
PARTITION BY RANGE(id)
Partition p0 VALUES LESS THAN (1234),
Partition p1 VALUES LESS THAN (4567),
Partition p2 VALUES LESS THAN MAXVALUE
Using OR statement is kind of tricky , if you are gonna match two couple of ID s at same query in joined table using OR keyword ,it has heavy impact on the performance.Basically ,instead of doing this,
try this kind of behavior:
from that we can reduce the execution time and the total reads happens on the SQL server.
Even though i mentioned that I have used the index ,didn’t explain how it’s work.Here we go:
Assume we have an index for a primary key. This will create an ordered list of primary key values in a separate table, each entry has a pointer points to the relative value in the original table.
So, whenever we want to access a table using the primary key, it will use binary search algorithm (takes time of O(LogN)) to access the required value in the Index table, and then, go to the relative value in the original table.
when we are using composite indexes by having multiple columns ,they are getting sorted according to their values.
If we want to create an index on a combination of columns, you can list the column names within the parentheses, separated by commas.
ALTER table by create or destroy Index: