This page contains answers to the ASE quiz questions from 2004.
More ASE Quiz questions: 2010 - 2009 - 2008 - 2007 - 2006 - 2005 - 2004 - 2003 - 2002 - 2001
June 2004
As a DBA, you may need to determine which tables in a database contain an identity column.
What is the quickest way to determine programmatically (i.e. not through sp_help) whether a table contains an identity column?
Answer:
Depending on the ASE version, there are different ways of doing this.
To start with ASE 12.5.0.3 or later, just use the next_identity(table_name) built-in function. When it returns NULL, the table does not have an identity column, or does not exist, or the specified name is not a table. When it returns any non-NULL value, it means the table exists and has an identity column:
-- ASE 12.5.0.3+
-- the basic idea for a single table:
if next_identity(@table_name) != NULL
print "This table has an identity column"
else
print "This table does not have an identity column"
-- find all tables with an identity col. in this database:
select name from sysobjects
where next_identity(name) != NULL
In ASE pre-12.5.0.3, you'll have to query the syscolumns table directly as follows:
-- ASE pre-12.5.0.3
-- find all tables with an identity col. in this database:
select object_name(id)
from syscolumns
where status & 128 = 128
Now, why would you need to do a thing like this? Well, as a DBA, tables with an identity column deserve your special attention: you may want to check whether the identity_gap property has been set, for example. Or you may want to run a stored procedure directly after ASE startup to check all tables with an identity column to see whether an identity gap would occur when a row is inserted -- and if so, fix the gap before it strikes your applications, with the procedure sp_idgap_repair (downloadable from here; for ASE 12.5.1+ only).
On a side note, did you know that you don't actually need to know the column name of an identity column in order to use it in queries? An identity column may always be referred to with the column name syb_identity; this works because a table cannot have more than one identity column at a time. For an example of how syb_identity can be used, see the code of sp_idgap_repair mentioned above.
May 2004
For a change, this month's quiz question is about Replication Server instead of ASE (mostly because I've been doing a lot of RepServer work lately while working on a yet-to-be-announced RepServer QuickRef guide (stay tuned!)).
RepServer configuration parameters are stored in the RSSD table rs_config. For a RepServer-wide config parameter (like memory_limit or num_threads), the normal way to change its value is through the RepServer command configure replication server. Alternatively, a setting can be changed through the RSSD procedure rs_configure, which is equivalent to manually updating the corresponding row in rs_config directly with a SQL query.
The question: which RepServer-wide configuration parameters in RS 12.6 should not be changed with rs_configure or by a direct manual update to rs_config, but only through configure replication server?
Answer:
First, a bit of background: RepServer 12.6 includes the new feature of the 'embedded RSSD' (ERSSD), where the RSSD is hosted by an ASA (SQL Anywhere) server instead of an ASE server.
The main advantage of using ASA for the RSSD is that it requires less DBA involvement than ASE. For example, RepServer (re)starts and stops the ASA server automatically when needed, and RSSD backups are scheduled and performed automatically. This should be especially useful when there isn't much ASE expertise around, for example in places where RepServer is used to replicate between non-ASE databases only.
Using ASA for the RSSD is an option when creating a new RepServer; the default still is an RSSD in ASE.
While the tables and stored procedures in the RSSD, and most RepServer functionality*, are identical for both the classic ASE RSSD and the new ASA ERSSD, there is a difference for the ERSSD-related configuration parameters.
When you've chosen the ERSSD, RSSD backups are scheduled and performed automatically. These backups can be configured with the RepServer config parameters erssd_backup_dir, erssd_backup_start_date, erssd_backup_start_time and erssd_backup_interval. These parameters exist only in the ASA ERSSD, and not in an ASE RSSD.
Upon closer inspection, it appears that the ERSSD backup scheduling and execution is performed by ASA. To be precise, the ERSSD backup scheduling information in rs_config is also stored in the ASA system tables SYSEVENT and SYSSCHEDULE. When changing one of the ERSSD config parameters with configure replication server, not only the values in rs_config are updated, but also those in SYSEVENT and SYSSCHEDULE.
However, it appears that these ERSSD parameters cannot be changed rs_configure:
1> rs_configure erssd_backup_interval, '12 hours'
2> go
Parameter 'erssd_backup_interval' Cannot be found or accessed through
'rs_configure'. Check your spelling or try 'configure connection' or
'configure database'
(Incidentally, note that the error message does not suggest the only correct action, namely using configure replication server)
Of course, nothing stops you from updating rs_config directly with a manual SQL query. However, sooner or later you'll find out that those changes have not taken effect, because they were never applied to the ASA system tables SYSEVENT and SYSSCHEDULE.
What appears to happen under the hood is that configure replication server not only updates rs_config, but -only for the ERRSD config parameters- also SYSEVENT and SYSSCHEDULE. When updating rs_config manually, these ASA tables are not affected, so you'd need to update them manually as well -- obviously, using configure replication server is more convenient.
This all means that the ERSSD config parameters are different from other RepServer-wide config parameters in a number of ways. Not only are they fully dynamic, and should they be changed only through configure replication server, but their existence in rs_config is merely a front-end to the ASA scheduling mechanism, and the rs_config are essentially read-only copies of the values in the ASA tables.
I suspect -but I'm guessing now- that this is the background of a puzzling statement in the release bulletin for RS 12.6, saying "rs_configure system procedure no longer supported" in RS 12.6. The release bulletin states that configure replication server must be used instead, but does not give any further explanation why rs_configure is suddenly not supported anymore. I have not found other reasons why rs_configure should be abandoned -- but as always, I'm open to your suggestions and experiences.
*In RS 12.6, you cannot create outgoing routes when the RepServer uses an ASA ERSSD. Since a route requires some RSSD tables to be replicated, a RepAgent of some kind would be needed for the ASA ERSSD database -- my guess is that this ASA RepAgent isn't yet available or something; hence this limitation. As far as I know, this is the only limitation in the RepServer functionality when using an ASA ERSSD instead of a classic ASE RSSD.
April 2004
Applications in your ASE server are suddenly suffering from heavy locking problems. As it turns out, the culprit is a new application, developed by an external contracting company, that has just been taken into production. This application appears to perform long-running updates on the most frequently used database tables. The exclusive locks held by these update statements block other users trying to access these tables.
Being the DBA, you're called into a meeting with your manager, the external contractor's manager and their developers to discuss a solution. The developers propose to use transaction isolation level 0 for the update statements performed by their application, so that blocking locks will not occur.
How should you respond to this suggestion?
Answer:
First, note that it's rare for developers to suggest a change to their own application in order to solve a concurrency problem involving that same application. In my experience, the typical reaction is to blame other applications or the database software itself, and suggest the DBA somehow sort it all out.
Nevertheless, the suggestion to use isolation level 0 for updates is useless. Transaction isolation level 0 only affects read operations (like select statements), by allowing to read through exclusive locks held by other sessions. Any data modifications will still use exclusive locks, irrespective of any isolation levels that may have been set. In other words, isolation level 0 will not reduce the exclusive locks caused by the updates at all.
Now that we're looking at isolation level 0, it would actually be logical to consider a related approach. What if isolation level 0 would be used for all sessions performing select statements? This would allow those selects to read through the exclusive locks, thus solving the locking problems, right?
Well, first note that this approach will not do anything to improve locking problems between two update statements, for the reason already mentioned.
Although level 0 indeed has a concurrency-improving effect on select statements, this has some serious potential drawbacks as well. First, queries on isolation level 0 have more strict requirements for using a unique index, meaning that an index that would normally be the most efficient, may not be used with isolation level 0. Also, queries running at level 0 may be aborted and restarted half-way when data is changed under the query's feet. In addition, level 0 queries will access copies of pages being modified by other sessions rather than access those pages directly, thus consuming more cache memory.
In other words, performance may actually get worse by using level 0. And we haven't even discussed the potential problems as a result of reading modified, uncommitted data...
As an ASE DBA, of course you know better than these flawed suggestions. Instead, the problem seems related to the way those updates are being performed: either they update too many rows, or use inefficient query plans, or both.
Your response should therefore be that the newly introduced application should be closely examined, and changed where necessary. Be careful before volunteering to be involved in this exercise, since this could amount to solving someone else's mess!
If the tables being updated use allpages locking, some concurrency improvement may be possible by changing to datapages or datarows locking, but this is unlikely to really fix the problem of inherently heavy updates. Better try getting the problem fixed at the root first!
The idea of using level 0 for DML is not hypothetical: I've actually seen this being proposed in situations similar to the one described in this question. However, transaction isolation levels appear to be a difficult topic, and more often than not, they are used incorrectly (transaction isolation level 3, a.k.a. 'holdlock' is another example). In my experience, non-standard isolation levels are hardly ever the solution to anything, but rather a source of new problems.
Lesson learned:
transaction isolation levels are often misunderstood. Think twice before using this as a solution for a database problem.
March 2004
A feature requested by a number of ASE users is a way to grant permission for using the truncate table command to other users. Currently, truncate table can only be executed by the table owner, and there is no documented way to grant this command.
Nevertheless, a workaround exists to overcome this limitation. How?
Answer:
In fact, there are even two workarounds.
The first workaround is to grant replication_role to a login (i.e. grant role replication_role to jsmith). When a login has replication_role, it can execute truncate table on any table in the databases it is allowed to access, although it may not even be allowed to select from those tables.
replication_role is a special system-defined role, which -as the name indicates- is intended for use within the context of Replication Server. When an ASE login has replication_role, it is not only allowed to execute truncate table, but also set triggers on/off, set replication on/off and dbcc gettrunc/settrunc. Indeed, all these commands have a replication-specific purpose (note that these set commands are not even formally documented in the ASE documentation, although they are described in the Replication Server manuals).
This workaround is an all-or-nothing deal: by granting replication_role, truncate table is effectively granted on all tables in each database the login is allowed to access. It is not possible to restrict the permission to a specific table. Therefore, be careful when using this workaround - you may want to use auditing to monitor truncate table execution (run sp_audit 'truncate', 'all', 'your_db_name', 'on' for all databases you'd like to keep an eye on).
The second workaround is to use the grant-sa_role-to-non-sa-users trick as described at www.sypron.nl/grant_sa.html. With this trick (which I won't discuss in detail here -- please follow the above link for full details), it is actually possible to restrict truncate table permission to specific tables. This can be implemented with a user table to record the tables a user is allowed to truncate and writing a stored procedure (sp_truncate 'table_name' ?) that checks this table to verify whether a truncation attempt is allowed. This is left as an exercise for the reader...
Incidentally, a more fundamental solution seems to be near.
According to http://www.sybase.com/detail?id=1027934, ASE 12.5.2 will include a new feature allowing truncate table to be granted (see Bugid 292330) ; this sounds as if the regular grant/revoke syntax will be supported, i.e. grant truncate table on MyTable to public -- we'll need to wait and see if it is allowed to grant this permission to a role as well.
According to the above web page, ASE 12.5.2 is expected in April/May 2004.
February 2004
A column of the timestamp datatype will always contain a unique value for each row in a database.
However, there is a situation when a timestamp column has the same value for different rows. When?
Answer:
When a table has no timestamp column, it can be added to the table with alter table:
1> create table t (a int)
2> insert t values (1)
3> insert t values (2)
4> go
(1 row affected)
(1 row affected)
1> alter table t add timestamp
2> go
1> select * from t
2> go
a timestamp
----------- --------------------
1 NULL
2 NULL
(2 rows affected)
As the last query shows, the values of the timestamp column are identical for both rows. As soon as a row is updated, the corresponding timestamp column will be assigned a unique value.
The timestamp feature, combined with the tsequal() function is intended for 'optimistic locking' (i.e. instead of locking a row, its timestamp value is remembered by the application; later, it is checked that nobody else has modified it, since this would have altered the timestamp). This works because timestamp columns values are guaranteed to be unique in a database (except as above), so an update to a row can always be detected.
Note that a timestamp column, although its name may suggest differently, has no direct relation to the real-world clock time. The ASE server automatically assigns the current value of the 'database timestamp' to the column when the row is inserted or updated. The database timestamp is essentially a sequential counter without any other purpose than to provide a unique value for every modification in the database.
Incidentally, one could argue that, formally speaking, the timestamp columns don't have the 'same' value, since NULL doesn't equal NULL (due to the three-valued logic in SQL, which stipulates that the conditions NULL = NULL and NULL != NULL are both neither true not false).
While that argument sounds correct, it is interesting that the tsequal() function does consider NULL = NULL to be true:
1> declare @t timestamp
2> select @t = timestamp from t where a = 1
3> select * from t where tsequal(@t, timestamp)
4> go
(1 row affected)
a timestamp
----------- --------------------
1 NULL
2 NULL
(2 rows affected)
Why does tsequal() behave this way? Well, my guess is that tsequal() just wasn't ever expected to deal with such a situation (indeed, this isn't something you'll quickly see in a production system).
January 2004
This question is about a rather simple update. We have the following tables t1 and t2; both have two int columns and contain two rows:
1> select * from t1
2> go
a b1
----------- -----------
0 100
0 200
(2 rows affected)
1> select * from t2
2> go
b2 c
----------- -----------
100 3
100 5
(2 rows affected)
What we want to do is assign the sum of column t2.c to t1.a for matching values in columns t1.b1 and t2.b2. In other words, we want the value 8 (=3+5) assigned to column t1.a for the row where t1.b1 = 100.
The following query is written to accomplish this:
update t1
set a = sum(c)
from t1, t2
where b1 = b2
and b1 = 100
Couldn't be simpler, right? But does this query work correctly?
Answer:
Perhaps surprisingly, this update query updates both rows in t1:
1> update t1
2> set a = sum(c)
3> from t1, t2
4> where b1 = b2
5> and b1 = 100
6> go
(2 rows affected)
1> select * from t1
2> go
a b1
----------- -----------
8 100
8 200
(2 rows affected)
In fact, if t1 had contained more rows, this query would have updated all those rows in the same way, irrespective of the value in their b1 column: the predicates where b1 = b2 and b1 = 100 are completely ignored by this update!
The problem lies in the sum() aggregate in the set clause. When replacing this aggregate by a constant or a variable, the update works as expected: it performs the specified join and update only 1 row (try this yourself!).
However, with an aggregate in the set clause, the update is effectively processed as follows (you can also see this with showplan on):
update t1
set a = (select sum(c)
from t1, t2
where b1 = b2
and b1 = 100)
This is an interesting case because the ASE documentation does not seem to describe this particular situation at all (well, at least I couldn't find it...). However, the docs do say that aggregates are allowed only in the select list or in a having clause, so it feels a bit funny that an update-with-aggregate is accepted by ASE after all (if anyone knows more about the background of this issue, I'll be glad to post it here).
The correct way to write this update is to use a correlated subquery (this is also ANSI SQL-compliant):
update t1
set a = (select sum(c)
from t2
where t1.b1 = b2)
where b1 = 100
Alternatively (and more clumsy), you can split up the query and replace the aggregate by a variable:
declare @s int
select @s = sum(c)
from t1, t2
where b1 = b2
and b1 = 100
update t1
set a = @s
from t1, t2
where b1 = b2
and b1 = 100
Instead of a variable, a noncorrelated expression subquery can also be used (note how the line in bold is essential to avoid updating all rows again):
update t1
set a = (select sum(c)
from t1, t2
where b1 = b2
and b1 = 100)
where b1 = 100 -- essential!
Note that the fipsflagger (i.e. set fipsflagger on) doesn't give any clues to this particular problem. It will generate a warning, indicating that a from clause in an update is non-standard SQL. However, the same warning is generated when replacing the aggregate by a variable, so the fipsflagger isn't particularly useful here.
Lesson learned:
Don't be satisfied just because your update statement seems to run fine. Always check the exact rows and columns being updated!
Go here for a full list of all quiz questions.
|