Please truncate the log file for tempdb on (server). We’re getting an error message:
The log file for database 'tempdb' is full. Back up the transaction log for the database to free up some log space.
Now, my first thought is that perhaps, just perhaps, there is a chance that my query is causing this issue. But this rocket scientist has already decided that there is no way his query is the culprit, and in fact it is a problem with tempdb. Here's an idea...why not tell us what you were doing when you got this message? Perhaps we could help you find out where the problem is? No? Okay then.
So, we respond back, tell them that tempdb is fine, there is lots of free space (naturally, since his trx was rolled back), and to try it again. It fails. Think he would make the connection yet? Unfortunately, no. So we go back and forth.
Them - "Please truncate again."
Us - "Sure, done." (It's easy to get this done quickly since his trx keeps rolling back!)
Them - "Please truncate again."
Us - "All set."
Them - "Please try it again, the truncate does not seem to be working."
Me - Oh, the truncate is not working? Let me get right on that for you.
Us - "Can you tell us what you are trying to run, perhaps we can figure out what it is you are doing that is filling up the log".
Them - "My code has not changed, it must be something else. Are you certain the log is being truncated?"
Dumbass. Let's see...your code has not changed...but you keep filling up the tempdb log file...could it be...the data has changed? Anything? Nothing? After a few more exchanges we get:
Them - "Well, we did migrate this code from a different server, so I guess you could say that something has changed."
Brilliant. I can see how you earned all those letters after your name. Too bad none of them contain "DBA".
Okay, so now we know that something has changed. Oh, and as it turns out, they may or may not be using the right version of the migrated code. And certainly the data is different, but why should that matter, right?
Thankfully I have no shortage of incidents like this one to keep me amused. It must be one of the reasons I come to work every day, the entertainment value here has got to be as good as anywhere else.
I was reviewing procedures from one of our dev teams today when I saw this (names to protect someone and there were a bunch more parameters, and yeah, I doubt this, as typed would compile, work with me):
CREATE PROC
@PK int
@Val1 varchar(50) OUTPUT
@Val2 datetime OUTPUT
@Val3 int OUTPUT
AS
IF EXISTS (SELECT 1 FROM Table1 WHERE pk = @PK)
BEGIN
SELECT @Val1 = Val1 FROM Table1 WHERE PK = @PK
SELECT @Val2 = Val2 FROM Table 1 WHERE PK = @PK
SELECT @Val3 = Val3 FROM Table 1 WHERE PK = @PK
END
ELSE
BEGIN
RAISERROR(‘We don’t have any rows that match that pk’)
END
If I could guarantee a jury of DBA’s, I know I’d be acquitted… Instead, I rewrote it and tried to educate them
SELECT @Val1 = Val1
,@val2 = Val2
,@val3 = Val3
FROM Table1
WHERE PK = @PK
And Val3< > 42
IF @@ROWCOUNT = 0
RAISERROR(‘We don’t have any rows that match that pk’)
END
The real query went from more than a second to less than 50ms and the reads against the table went from several hundred to 2. All that and I still have to justify to developers why they send their code through for review…
Working in a dot.com as a dba. This particular dot.com was extremely concerned with getting as much code out the door as humanly possible. They weren’t terribly concerned with the quality of the code or any of the data. We were collecting over a gb of data every day in SQL Server 7.0. To say the least, the system required serious babysitting, monitoring disk space, log space, growth, etc. The DBA’s spent a considerable amount of time, figuring out how best to monitor & respond to the crazed behavior we got in the system. But that didn’t sit well with the business majors running the company (“all from top schools”).
So we got the word one day, stop spending time messing with maintenance and spend more time churning out code with the developers. Less than a week later, one night a strange spike in activity (right after a new code release that we didn’t monitor) filled the log drive. We tried recovering the database, but nothing worked. We spent three straight days in the office, first attempting to get all the data back, then restoring an older backup, then fixing the code that kept breaking the system. After three days without sleep, we had the system back online. Our glorious kid managers didn’t send a thank you or an atta-boy, but they did send an email instructing us to begin monitoring the systems, especially during and after code releases.
All three dba’s quit within three months of that incident and the company went the way of the dodo less than a year later.