Beginning a SIF Implementation
The intent of this document is to serve as a guide for school organizations that want to start a SIF implementation, with the goal of implementing Identity Management (IdM) or a VLE across a number of schools and including data from:
- elementary schools
- secondary schools
- secondary school consortiums (where the SIS suppliers may or may not be the same)
- special education schools (those who would have a SIS-like application that would keep track of its students)
- schools with good quality data and those with bad quality data
This guide assumes the use of Visual Software products. Although some parts of this guide are generic (such as the material on SIF agents and the Zone Integration Server (ZIS)), other parts, such as the use of Veracity, presently apply only to products available from this supplier.
At a very high level, the steps are as follows:
- Install the Zone Integration Server (ZIServer) into the environment
- Set up agents for the SIS systems for the schools that will be participating
- Set up logical instances of Veracity for the schools that will be participating
- Create a set of Veracity rules that:
  - match the data quality policies of the top level organization
  - illustrate where Envoy will be needed (show where the duplicate records will be)
  - (talk about this)
- Have schools clean up data
- Set up Envoy and verify business rules with organization
  - Learners (after this, the IdM can start to be set up)
  - Teachers (after this, the VLE can start to be set up)
- Continue setting up other agents
Installing the Zone Integration Server
In preparing to do this, you will first need to decide whether you need redundancy. This can mean many things, since the ZIS is made up of a web part and a database part, and each of these parts offers different redundancy options. For a complete description of these options and which are supported by our ZIServer product, see ZIServer at the Enterprise Level.
If the ZIServer is to be installed on a single server or on a single web server and a shared SQL server, then the install procedure is much more direct.
Single Server Installation (non-redundant)
This is the simplest configuration. IIS and SQL Server are both installed on the same server and the ZIServer.MSI is typically executed with all of the default options (unless you would like to change the default location or web site port number).
NOTE: When you first install SQL Server, you should choose “mixed mode authentication”, or if this is an existing server, you may change it through the SQL Server properties.
Two Server Installation (non-redundant)
If you are using a two-server configuration, you should run the ZIServer.MSI twice: on the database server machine first, then on the web server. Both times, choose to install a custom configuration. When installing on the database server, only choose the database parts and when installing on the web server, choose to not install the database parts.
When you install the web parts, you will still be required to specify the name of the database server – it will need to know this so it can connect the two parts together.
High Availability Installation – Database Part
SQL Server allows its users to “scale up” or (with a lot of work) “scale out”. We highly recommend staying with the “scaling up” model (use a server that allows for growth – extra processor slots), but using one of the following two techniques to provide redundancy:
- SQL database mirroring, high safety mode
- SQL database clustering
We prefer SQL database mirroring over clustering for two reasons:
- The most important reason is that with database mirroring, there is no single point of failure. With SQL clustering, there is only one copy of the databases, shared between the two machines. With database mirroring, if a block becomes corrupted (and cannot be repaired through the RAID mechanisms), it is automatically repaired from the copy of that block held by the mirror. As stated in the SQL Server documentation "A failover cluster does not protect against disk failure. You can use failover clustering to reduce system downtime and provide higher application availability."
- Since ZIServer has two databases, we recommend assigning one to each of two servers as its principal server. In this way, we achieve some of the performance benefits of “scaling out” not normally associated with SQL Server.
We do not recommend any high-availability database techniques that involve log shipping, replication or scalable shared databases. The problem with these methods is that, especially in busy systems, the copy can be several seconds or more out of date from the original. If messages are lost in between, some SIF agents have a very difficult time recovering.
High-Availability Installation – Web Server Part
In choosing the high-availability method for the web part, you will have a few choices as well – some will depend on your network and the type of connections (HTTP or HTTPS) you need to make. Acceptable choices include:
- Windows Network Load Balancing – this functionality is built into the Windows Server operating system. If you are considering this, we recommend using Windows Server 2008 R2 over one of the earlier versions (it is much easier to set up, it no longer requires special hardware and we’ve found it much more reliable as well).
- Windows Clustering – this is sometimes referred to as an “active-passive” setup where one server is continually backing up the other and both servers share a common set of disk drives. There is a version of this known as “geographically dispersed multi-site clustering”, used for disaster recovery scenarios, that uses multiple data stores, but most of the time, Windows clustering suffers from the disk being a single point of failure.
- Use of hardware Load Balancing Devices – these devices typically handle switching traffic between multiple machines but can also sometimes handle HTTPS encoding as well. These roughly serve the same purpose as the first option, except that the operating system does not need to spend any of its time dealing with the routing of messages. On the other hand, the operating system has no knowledge of how the routing is being done and is sometimes at a disadvantage when making certain types of decisions – there are tradeoffs.
Side note: Do not assume that putting a Zone Integration Server on a high-availability platform automatically makes it a high-availability ZIS. ZIServer has been designed and programmed for these environments. For example, most ZIS products are likely to receive messages correctly from both pull mode and push mode SIF agents. Likewise, they will most likely handle sending messages to pull mode agents. However, correctly sending messages to push mode agents, especially when a server goes down and then comes back up again, is where most will fail, and hardware or operating system assistance will not help at all.
Set Up Student Information System Agents
The Student Information System (SIS) SIF agent should be set up before the SIF agents for any of the other systems. This is because the SIS is typically the primary feeder for most other systems in the zone.
Ideally, SIF agents would be available from the SIS suppliers and one could be purchased or otherwise obtained from the supplier. The SIF agent serves as an adapter between the application and the rest of the infrastructure and packages the information from the SIS and makes it available to the other applications that are connected to it. Normally, the SIS would send the information whenever it changes through its own SIF agent, but if no SIF agent exists directly from the supplier, we offer two options:
- Mimic – this is a SIF agent wizard that is intended for those applications that can create extract files for objects such as schools, learners, teachers, contacts, classrooms, courses, etc. in CSV format and deposit them into a directory on a regular basis (such as once an hour). The Mimic SIF agent will regularly look at these to see differences in the files between the current file and the last time it looked. It will then generate SIF events corresponding to the differences. It will also respond to SIF requests from other agents in the zone. Below is a simple slide show illustrating Mimic’s configuration and installation process – for more information see Mimic.
- ZIAgent – this is our fully functional, configurable SIF agent made for existing applications. Unlike SIF Agent Development Kits (ADKs), ZIAgent allows its user to configure a SIF agent in a matter of hours to days without programming. The resulting agent has sophisticated features such as audit trails, automated code discovery, dynamic record matching, business rules, regression testing and much more. For more information, see ZIAgent.
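The file-comparison behaviour described for Mimic can be illustrated with a short sketch. This is not Mimic's implementation; it only shows the general idea of comparing the previous extract with the current one and deriving Add, Change and Delete events from the differences. The column names (LearnerId, FamilyName, GivenName) and the function names are assumptions made for the example.

```python
import csv
import io


def load_snapshot(csv_text, key_field):
    """Parse CSV text into a dict mapping each record's key to its row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def diff_snapshots(previous, current):
    """Return (event_type, key, row) tuples describing what changed."""
    events = []
    for key, row in current.items():
        if key not in previous:
            events.append(("Add", key, row))
        elif row != previous[key]:
            events.append(("Change", key, row))
    for key, row in previous.items():
        if key not in current:
            events.append(("Delete", key, row))
    return events


# Hypothetical extracts: the file seen last time and the file seen now.
previous_csv = "LearnerId,FamilyName,GivenName\n1,Smith,Ann\n2,Jones,Bob\n"
current_csv = "LearnerId,FamilyName,GivenName\n1,Smith,Anne\n3,Patel,Raj\n"

events = diff_snapshots(
    load_snapshot(previous_csv, "LearnerId"),
    load_snapshot(current_csv, "LearnerId"),
)
for event_type, key, _row in events:
    print(event_type, key)
```

In this sketch, learner 1 produces a Change event, learner 3 an Add event and learner 2 a Delete event. A real agent would also need to handle partially written files and keep the previous snapshot between runs.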
Connections to these agents should be set up in the ZIS first and the Access Control Lists (ACLs) for the agents should be set up to control the information that they will publish. These lists can be set up in two ways: at the object level and at the element level.
Object Level Control
Object level control gives the administrator control over the objects that a SIF agent is allowed to publish and over those to which it is allowed to subscribe. The screen in the ZIServer administration tool that allows control over object permissions looks like this:
The permissions for each object include Provide, Request, Respond, Subscribe, Publish Add, Publish Change and Publish Delete. In this example, the "TestProvider" agent was allowed to Provide, Respond and Publish all types of events for all types of objects in our test environment. In a production environment, it is best to determine which objects will be needed by subscribers in the zone and only publish those objects.
NOTE: We also recommend shutting off the unneeded objects in the SIF agent software, so that it doesn’t continually try to do things that it is not allowed to do by policy.
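As a rough illustration of object level control, the sketch below models an access control list keyed by agent and object, using the seven permissions listed above. The class, its methods and the object name LearnerPersonal are assumptions for the example and do not describe how ZIServer stores or evaluates its ACLs.

```python
PERMISSIONS = {
    "Provide", "Request", "Respond", "Subscribe",
    "PublishAdd", "PublishChange", "PublishDelete",
}


class ObjectAcl:
    """Hypothetical in-memory ACL: (agent, object name) -> granted permissions."""

    def __init__(self):
        self._grants = {}

    def grant(self, agent, object_name, *permissions):
        unknown = set(permissions) - PERMISSIONS
        if unknown:
            raise ValueError(f"Unknown permissions: {sorted(unknown)}")
        self._grants.setdefault((agent, object_name), set()).update(permissions)

    def is_allowed(self, agent, object_name, permission):
        # Anything not explicitly granted is denied.
        return permission in self._grants.get((agent, object_name), set())


acl = ObjectAcl()
acl.grant(
    "TestProvider", "LearnerPersonal",
    "Provide", "Respond", "PublishAdd", "PublishChange", "PublishDelete",
)

print(acl.is_allowed("TestProvider", "LearnerPersonal", "PublishAdd"))
print(acl.is_allowed("TestProvider", "LearnerPersonal", "Subscribe"))
```

The deny-by-default lookup mirrors the production advice above: only the objects that subscribers actually need are granted, and everything else is refused.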
Element Level Control
Element level control has been recently added to the SIF specification. It allows the administrator to further restrict the information sent to subscribing applications at the element (or attribute) level. On the ZIServer product, certain objects and elements within those objects are pre-selected as “those that would be likely candidates for having restrictions” (this makes the user interface easier to use and runtime processing faster). In the ZIServer ACL interface (as shown above), the names of objects that are likely candidates for element level filtering are shown as hyperlinks. When one is selected, another screen is displayed that looks like this:
On the left side of the screen are the names of the elements (or attributes) that can be restricted from being delivered to this agent. If the box is selected, then messages that would normally be delivered to this agent will have these elements removed from them before they are delivered.
NOTE: In ZIServer’s audits, you may notice that a single event message appears one time for each agent it is delivered to – this is because there is a potential that each copy of the message may have slightly different contents because the element level filtering may be set differently for the different agents.
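The effect of element level filtering, and the reason each agent receives its own copy of a message, can be sketched as follows. The XML shape, the agent names and the function are simplified assumptions for illustration; they are not the actual SIF schema or ZIServer's code, and the sketch handles elements only, not attributes.

```python
import copy
import xml.etree.ElementTree as ET


def filter_for_agent(message, restricted_tags):
    """Return a copy of the message with the restricted elements removed."""
    filtered = copy.deepcopy(message)
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(filtered.iter()):
        for child in list(parent):
            if child.tag in restricted_tags:
                parent.remove(child)
    return filtered


event = ET.fromstring(
    "<LearnerPersonal RefId='A1'>"
    "<FamilyName>Smith</FamilyName>"
    "<BirthDate>2001-05-17</BirthDate>"
    "</LearnerPersonal>"
)

# Hypothetical per-agent restrictions, as an administrator might tick them.
restrictions = {"LibraryAgent": {"BirthDate"}, "VleAgent": set()}

copies = {
    agent: filter_for_agent(event, tags)
    for agent, tags in restrictions.items()
}
for agent, message in copies.items():
    print(agent, ET.tostring(message, encoding="unicode"))
```

Because the filter works on a deep copy, the original event is untouched and each agent's copy can differ, which is consistent with one audit entry per delivered copy.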
Set up Logical Instances of Veracity
This is where our (Visual Software’s) recommendations and most everyone else’s diverge. Our recommendation is that before you connect an automatic feed to subscribing applications, you know what the data is going to look like and how well it will meet the organization’s standard of quality. This may simply be an exercise to verify that everyone has been meticulously entering all data, but more often than not, it is proof of what everyone has known all along – the data needs some work.
Setting up Veracity for this purpose is quite simple – at a later point, we will get more elegant with setting up additional user accounts, but at this stage we simply want to get the data in good enough shape so that we can connect the other applications (including identity management) and have a good chance of success.
Setting up Veracity also includes defining business rules that reflect data quality standards for the organization. These can include checks such as the following:
- make sure all learners have reasonable birth dates
- make sure all learners have home addresses specified in their demographic records
- make sure all learners have a value for the gender field that is ‘M’ or ‘F’ (not missing or ‘U’ (unknown) – although allowed by the SIF specification, this value may not be acceptable for this organization)
- make sure all learners have a given name and a family name
- check for learners who are enrolled in an “English as a Second Language” class whose primary language is English
- check for invalid phone number format
Rules like these may come from local standards or may be the result of requirements of state or census reporting. Furthermore, the checks may be labeled as “Errors”, “Warnings” or “Information Only” to indicate a level of severity attached to it.
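To make the idea of severity-labelled rules concrete, here is a minimal sketch of a few of the checks listed above. It does not use Veracity's rule syntax, which is not described in this guide. The field names, the age range, the fixed reference date and the phone pattern are all assumptions chosen for the example.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Callable

# Assumptions for illustration only.
REFERENCE_DATE = date(2024, 9, 1)
MIN_AGE, MAX_AGE = 3, 21
PHONE_PATTERN = r"\+?[0-9 ()-]{7,20}"


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str  # "Error", "Warning" or "Information Only"
    passes: Callable[[dict], bool]


def has_reasonable_birth_date(learner):
    birth = learner.get("birth_date")
    if birth is None:
        return False
    age_years = (REFERENCE_DATE - birth).days / 365.25
    return MIN_AGE <= age_years <= MAX_AGE


RULES = [
    Rule("Birth date is reasonable", "Error", has_reasonable_birth_date),
    Rule("Given name and family name present", "Error",
         lambda l: bool(l.get("given_name")) and bool(l.get("family_name"))),
    Rule("Gender is M or F", "Warning",
         lambda l: l.get("gender") in ("M", "F")),
    Rule("Phone number format is valid", "Information Only",
         lambda l: re.fullmatch(PHONE_PATTERN, l.get("phone", "")) is not None),
]


def check(learner, rules=RULES):
    """Return (severity, rule name) for every rule the record fails."""
    return [(r.severity, r.name) for r in rules if not r.passes(learner)]


learner = {
    "given_name": "Ann", "family_name": "", "gender": "U",
    "birth_date": date(2010, 3, 14), "phone": "555-0100",
}
findings = check(learner)
for severity, name in findings:
    print(f"{severity}: {name}")
```

For this record the sketch reports an Error for the missing family name and a Warning for the gender value. Keeping severity as data on each rule makes it easy to start with a small set of Errors and add Warnings later.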
Set up Veracity Rules
The rules that will be applied against the school data should, at a minimum, look for the errors in the data that would inhibit the assigning of learner, teacher and contact identifiers.
It is a good idea to start with a limited set of rules because, in our experience, it is very disheartening for school users to see that they need to correct 300,000 errors in their data before it can be used (we’ve seen worse than that…). We find that it’s best to start with the rules that you need and then add new ones over time, once the users have had a chance to experience some of the benefits of the software.
Have Schools Clean Up Data
This will most likely require some training, encouragement, and finding a way for school users to see the benefit in doing the work to clean up the data. In the training it will be necessary to show school users that as soon as they make the correction in the SIS, the error drops off the Veracity screen. It might also be good to include a demonstration of SIF in action during the training so that the users sense that the work being done is for a purpose other than “the boss says so” (it helps when there is a purpose in what they are doing).
Expect that this process could be one of the longest parts of the project, depending on the initial quality of the data, the aggressiveness of the set of rules and the time that the school is willing to put into correcting the errors.
Connect Other Applications
Once the data has been cleaned at each of the schools, you are now ready to connect other subscribing applications into the zone.
Applications Installed at Schools
If the Student Information System is installed at the district level and there are applications installed independently at schools that need to be fed data, then the consolidated data will need to be split dynamically by school. Few (if any) Student Information System SIF agents will do this automatically. For environments such as this, we developed a product called Envoy. To learn more about this, see: SIF_Zone_Partitioning_Using_Managed_Virtual_Zones
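The splitting problem itself can be shown with a small sketch. This is not a description of how Envoy works; it only illustrates partitioning a consolidated, district-level feed so that each school's applications see their own records. The record layout and the school_id field are assumptions for the example.

```python
from collections import defaultdict


def partition_by_school(records, school_field="school_id"):
    """Group consolidated records into one list per school."""
    zones = defaultdict(list)
    for record in records:
        zones[record[school_field]].append(record)
    return dict(zones)


# Hypothetical district-level feed containing learners from two schools.
district_feed = [
    {"learner_id": "1", "school_id": "S01", "family_name": "Smith"},
    {"learner_id": "2", "school_id": "S02", "family_name": "Jones"},
    {"learner_id": "3", "school_id": "S01", "family_name": "Patel"},
]

zones = partition_by_school(district_feed)
for school_id, records in sorted(zones.items()):
    print(school_id, [r["learner_id"] for r in records])
```

A real deployment also has to deal with learners enrolled at more than one school and with records that carry no school reference, which a simple grouping like this does not address.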