Creating and Using RefId Values
The RefId is one of the pillars of the SIF specification — its understanding and proper use will make the difference between a successful and efficient implementation and one that will be a continual source of problems and incompatibilities.
RefId values are used to uniquely identify objects. Almost all SIF objects have their own RefId whose value is used to uniquely and permanently identify the object they represent.
When the SIF object is established, the value of the RefId should be created and it should remain constant as long as that SIF object represents that object. So, for example, when a StudentPersonal object is created to hold student demographic and other personal information, the corresponding RefId value should be assigned and should never be changed as long as the student remains enrolled. Other rules include:
(please don't be offended if they appear too simple; they represent real problems)
- Never send a Delete or a Change before an Add for a given RefId (send messages in the proper order)
- Never send an Add or a Change after a Delete for a given RefId (do not reuse RefId values)
- RefId values for a given object should never change - even between school years. An exception to this would be if a student moves from one school to another and his or her records are managed by different systems, then his or her RefId values would be different in the new system. But, if the same system still manages his or her records, then the RefId values should remain the same.
The Assigned Value
The value of a RefId (in SIF 2) should be 32 hexadecimal characters (0−9 and the letters A‑F), no dashes or spaces. This value is also sometimes referred to as a GUID. The value assigned to a RefId should be “globally unique”. The best way to do this is to use a function that generates a GUID (operating systems and the SQL language have built-in functions for generating these). For example, the following value may be given as a default value to a RefId field in a SQL Server database to have them automatically generated:
The SQL Server NEWID() function generates GUID values with dashes in them; the REPLACE() function removes them, and by supplying this as the default value it will be automatically generated and filled in for new objects when the value is not supplied when an INSERT is not done. Using MySQL, the function would look like:
UNHEX(REPLACE(UUID(),' — ',' ') );
Using Oracle, this function is called sys_guid(); in PostGreSQL they have several functions such as guid_generate_v4() which allow you to choose your GUID generation option. Most other modern databases have such functions built in. so there shouldn't be any need for anyone to write new code.
Calculating RefId Values
Ideally, RefId values should be assigned by a function such as NEWID(), mapped to an existing object and maintained in a database. This will always lead to a system that works well, works under all circumstances and works well with other applications from all suppliers (if they also use RefIds properly). RefId values should never be calculated from the value of any data that was originally entered by hand.
For example, we’ve seen a system that calculated a StudentPersonal RefId value by taking the school number and the student ID and applying a function to them to get a 32-character hexadecimal value. The intent was to avoid having to store the values in a database. The problem occurred when the data from two schools were combined into a single database. When the function was applied to the two School A, Student B’s information, the RefId generated matched that from School X, Student Y’s information. (Actually the problem was even more difficult to see than this. It took many days to track down what was happening, but the result was that the combined database was corrupted. The two students' records were being “merged” accidentally).
A possible exception to this might be if the basis for the calculation is something that guaranteed to be 100% unique, all the time in all contexts and if the function for mapping the basis for the calculation to the RefId has been thoroughly tested. Our recommendation is to never do this. Our experience has shown that everyone who has implemented calculated RefIds has been 100% unconvincingly confident that their model is different, it will work out well, and in every case it has worked out terribly.
Keeping Backup Copies
If you do what we suggest – that is, to randomly generate RefId values for new SIF objects and map them to the other objects they represent – it will be very important to make sure that the SIF agent’s mapping tables are backed up on the same schedule as the application’s database. This is done so that if a restore of one is done, the corresponding mapping tables can be done as well.