Say you have a query with performance issues, and you decide to look at its graphical execution plan in SSMS (or another tool). If you are lucky, right off the bat SQL Server reveals a quick and easy fix:
It's missing a crucial index! You add the missing index, the problem is solved, and you are the hero of the hour!
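For illustration, the statement SSMS generates from such a missing-index hint looks roughly like the one below (the table, columns, and index name here are made up; always review the key order and INCLUDE list before creating it):

-- Hypothetical index built from a missing-index hint in the plan.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);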
Sooner or later, you will come across queries that don't have such obvious fixes, aka low-hanging fruit. And you will realize that analyzing execution plans is tedious and time consuming, and it's practically impossible to read every single property of every single node of every single query. So you decide to be smarter about it and apply the Pareto principle: 1) focus your energy on the 20% of queries causing 80% of the performance issues (or maybe there is no performance issue to solve per se and you just want to make things run better), and 2) within the execution plans for those 20% of queries, devote your attention to the highest-impact or most informative operators, for example the ones covered in the sections below.
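Before looking inside the plans themselves, here is one hedged sketch of that first step: the plan cache DMVs can surface the most expensive cached statements, and which column you order by (reads, CPU, duration) depends on what "costly" means to you:

-- Top 20 cached statements by total logical reads.
-- Swap the ORDER BY to total_worker_time (CPU) or total_elapsed_time as needed.
SELECT TOP (20)
    qs.total_logical_reads,
    qs.total_worker_time,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;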
Plan metadata:
Start by reviewing the plan metadata. It is the very first operator, listed on the far left of the plan, and it contains metadata about the plan as a whole. As you can see in this screenshot, the information displayed is, to say the least, informative:
At the bottom you see the “Warnings” that are associated with this plan, all in one place.
Warnings:
Warnings are indicators of a possible issue with the query that could affect performance. This doesn't necessarily mean there is a real problem; it's just an indicator of a possible one and should be investigated.
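If you want to hunt for such plans proactively, one sketch is to search cached plans for a Warnings element (this shreds the plan XML of every cached plan, so it can be expensive on a busy server):

-- Find cached plans that contain any warning node.
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
    st.text AS query_text,
    qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan.exist('//Warnings') = 1;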
The tooltip will show the reason for the warning:
In that particular case, if I am not using that column in a WHERE clause, then this is most likely a false warning, as it cannot affect the row count estimates. In the following example, however, both queries show a warning in the execution plan due to the use of the CAST function, but it is only the second query where performance is negatively affected: there the warning is due to the CAST function being applied inside the WHERE condition, making the predicate essentially unsargable. First, you will notice that the second query has switched to a scan, versus the seek operator used by the first query.
And secondly, the second query requires significantly (relatively speaking) more I/O to get the same results.
(1 row affected)
Table 'TBL_______________'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 67, lob physical reads 0, lob read-ahead reads 0.
(1 row affected)
(1 row affected)
Table 'TBL_______________'. Scan count 1, logical reads 1367, physical reads 0, read-ahead reads 0, lob logical reads 69, lob physical reads 0, lob read-ahead reads 0.
(1 row affected)
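To reproduce this kind of comparison yourself, you can run a pair of queries along these lines with STATISTICS IO turned on (the table and column names below are hypothetical, not the redacted table from the output above):

SET STATISTICS IO ON;

-- Query 1: CAST only in the SELECT list; the predicate stays sargable,
-- so an index seek on OrderNumber is still possible.
SELECT CAST(OrderNumber AS varchar(20)) AS OrderNumber
FROM dbo.Orders
WHERE OrderNumber = 12345;

-- Query 2: CAST wrapped around the column inside the WHERE clause; the
-- predicate is no longer sargable, which typically forces a scan and far more reads.
SELECT OrderNumber
FROM dbo.Orders
WHERE CAST(OrderNumber AS varchar(20)) = '12345';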
Most Costly Operations:
Even though the costs listed are estimates only, the optimizer relies on those estimates, and therefore we should pay close attention to them too.
Fat Pipes:
Pipes are the arrows that connect one operator to the next and represent data movement. The width of a pipe indicates the amount of data being processed, so a thick pipe means lots of data and a thin one means less. Another thing you want to pay attention to is a change in pipe width: from thin to fat, from fat to thin, or a fat pipe getting even fatter.
Scans:
Relatively speaking, seeks are better than scans. It's not that seeks are always good or that scans are always bad, but scans do indicate I/O, which is often an issue for query performance. Sometimes the optimizer will choose a scan even when a seek is available, and often it's right to do so. As a simplistic example, if my query is SELECT * FROM MYTABLE; with no WHERE clause or any other filtering condition at all, the optimizer can only choose a scan; in fact, for such queries a seek would run slower than a scan.
In short, look at the scan operators more closely than, say, the seek operators.
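As a quick, hypothetical illustration of that point:

-- No filter at all: a clustered index (or table) scan is the only sensible plan.
SELECT * FROM dbo.Orders;

-- Selective predicate on an indexed column: the optimizer can seek instead.
SELECT * FROM dbo.Orders WHERE OrderID = 42;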
Huge discrepancies between estimated and actual counts:
For example, in the screenshots below the estimated number of rows is only 8, but the actual plan returned 5.1 million rows. That's a huge discrepancy.
Common causes include out-of-date statistics, parameter sniffing, and a bad query plan getting cached (use WITH RECOMPILE to get rid of it), or sometimes a combination of more than one factor.
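The usual first-aid statements for those causes look roughly like this (object names are placeholders):

-- Refresh statistics so cardinality estimates reflect current data.
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Ask for a fresh plan for a single statement instead of reusing a bad cached one.
DECLARE @CustomerID int = 42;
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE CustomerID = @CustomerID
OPTION (RECOMPILE);

-- Or recompile a whole stored procedure on each execution (the WITH RECOMPILE mentioned above).
EXEC dbo.usp_GetOrders @CustomerID = 42 WITH RECOMPILE;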
Another discrepancy to pay attention to is a very high "Estimated Number of Rows to be Read" compared to a very low "Estimated Rows Per Execution". This may be an indication that your WHERE clause is unsargable. A very simple example would be using aggregate functions (SUM, COUNT, etc.) against all rows.
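As another hypothetical illustration, an unsargable predicate shows the same signature: the scan must read every row, so "Estimated Number of Rows to be Read" stays near the table's cardinality, while only a handful of rows survive the residual filter and come out of the operator:

-- YEAR() wrapped around the column prevents an index seek on OrderDate;
-- every row is read, but only the 2023 rows are returned.
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2023;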
Final thoughts:
Often this approach works to quickly identify query performance problems. Even when it doesn't, it's still a good starting strategy. I will admit, though, that it has its limits and won't always be enough or "good enough"; at some point you do have to put in the real hard work of understanding and optimizing your query.
At that point it's a good idea to start paying attention to the NodeIDs of each operation. NodeIDs are integer values, 0...n, assigned from left to right. They let you know the order in which your query is being processed by SQL Server, and some operators even refer back to other operators.
Next, keep track of the output of each operator. Some operators add or change columns in the query results, so knowing where each column came from can be very helpful.
And finally, to get a complete picture, get into the habit of reviewing the properties of each operator rather than just looking at the information in the tooltip.