Paper 022-2007
SAS Global Forum 2007 Applications Development
%STPBEGIN: How Enterprise Guide® Almost Removed the L-word
from My Relationship With SAS®
Rupinder Dhillon, Dhillon Consulting Inc., Toronto, ON
Peter Eberhardt, Fernwood Consulting Group Inc., Toronto, ON
Somewhere in the SAS Enterprise Guide documentation we read how easy it was to create a SAS stored process: write and test your code, add a simple %stpbegin and %stpend, and let SAS Enterprise Guide take care of the rest.
Presto, a new stored process ready to conquer the enterprise. Although that might work in the Hello World example, our first stored process project proved that applying the same guidelines to a more complex SAS program was much more challenging. In reality, creating stored processes involves more than adding the basic syntax, input parameters and macro variables. You must also be aware of the differences between SAS Enterprise Guide and PC SAS sessions, the importance of efficient coding, and the distinctions between the Workspace and Stored Process servers. This paper steps through some “gotchas” that may stump you when converting existing code or creating new stored processes from scratch.
THE PROBLEM
Our story starts with a complex set of financial forecasting models developed by some very clever mathematicians and statisticians. Our job was to convert the SAS code in these models into more generalized programs in which all the ‘hard coded’ values, and there were lots of them, were converted to macro variables so the model could be run more easily and consistently by the modelers.
Centralizing the parameters as global macro variables would provide two benefits. First, it would allow the modelers to change the parameters without having to search through thousands of lines of SAS code for each of the values they had hard coded. Second, it would help to formalize and document the model.
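As a minimal sketch of this kind of conversion, the fragment below gathers a few values into a parameter block and refers to them through macro variables. The variable names, values, and the toy calculation are hypothetical illustrations and are not taken from the actual forecasting models.

```sas
/* Hypothetical parameter block: names and values are illustrative only. */
%let forecast_start = 01JAN2007;  /* first month of the forecast        */
%let horizon_months = 12;         /* number of months to project        */
%let growth_rate    = 0.03;       /* assumed annual growth rate         */

/* The model code refers to the macro variables instead of literals.    */
data work.forecast;
  do month = 0 to &horizon_months - 1;
    period = intnx('month', "&forecast_start"d, month);  /* month start date */
    factor = (1 + &growth_rate) ** (month / 12);         /* compound growth  */
    output;
  end;
  format period monyy7.;
run;
```

With the values collected in one place, changing a scenario means editing the %let statements rather than hunting for literals throughout the program.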
This initial conversion was successful; the model became completely data driven. Using PC SAS to access data on the network, the modelers could turn around forecast requests in well under an hour, where previously it took the better part of a day. However, there were a large number of parameters, and a very good understanding of the model was required to assign the macro variables correctly before running the model. Moreover, a number of these parameters were set up to control output so the modelers could validate the run, and to allow data overrides for more complex scenarios.
With the initial conversion in place and the demand for model results increasing, the decision was made to make the model widely available; business analysts across the organization would have access to the model directly, freeing the modelers from the task of running models and allowing them to return to the task of developing them. The strategic decision of the organization was to move to the SAS BI architecture and to have SAS Enterprise Guide be the tool of choice to access and analyze data, so the decision was made to convert the model into a SAS Stored Process. Before we describe the issues we encountered in converting the model, let’s look at the SAS BI architecture.
SAS BI ARCHITECTURE
The SAS Intelligence Platform architecture is designed to efficiently access large amounts of data, while simultaneously providing timely intelligence to a large number of users. The platform uses an n-tier architecture that enables you to distribute functionality across computer resources, so that each type of work is performed by the resources that are best suited to the job.
You can easily scale the architecture to meet the demands of your workload. For a large company, the tiers can be installed across a multitude of machines with different operating systems. For prototyping, demonstrations, or very small enterprises, all of the tiers can be installed on a single machine.
The architecture consists of the following four tiers:
DATA SOURCES
Data sources store your enterprise data. All of your existing data assets can be used, whether your data is stored in relational database management systems, SAS tables, or ERP system tables.
SAS SERVERS
SAS servers perform SAS processing on your enterprise data. Several types of SAS servers are available to handle different workload types and processing intensities. The software distributes processing loads among server resources so that multiple client requests for information can be met without delay.
MIDDLE TIER
The middle tier enables intelligence data and functionality to be surfaced to users via a Web browser. This tier provides Web-based interfaces for report creation and information distribution, while passing analysis and processing requests to the SAS servers.
CLIENT TIER
The client tier provides users with desktop access to intelligence data and functionality through easy-to-use interfaces. For most information consumers, reporting and analysis tasks can be performed with just a Web browser.
For more advanced design and analysis tasks, SAS client software is installed on users’ desktops.
Now that we’ve set the stage, we’re ready to take a closer look at what’s involved in creating your first Stored Process.
A TALE OF TWO SERVERS – STORED PROCESS SERVER VS. WORKSPACE SERVER
One of the first things you should decide before you get started is whether your Stored Process will run on a Workspace Server or a Stored Process Server. In our case, we knew that stored processes could be run on either, but without understanding the more intricate differences between the two, we did not foresee how running on one versus the other would affect the way in which the Stored Process should be coded and developed. It is therefore important to understand the differences between the two before choosing one, and especially before you start coding.
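To make the discussion concrete, here is a minimal sketch of the stored process skeleton the documentation alludes to: ordinary SAS code wrapped in %stpbegin and %stpend. The input parameter `region` and the use of the SASHELP.SHOES sample table are assumptions chosen for illustration. The `*ProcessBody;` comment reflects our understanding of SAS 9.1, where a stored process that runs on the Workspace Server needs it to mark where the input parameters are inserted; verify this against the documentation for your release.

```sas
%global region;               /* hypothetical input parameter, e.g. Canada   */

*ProcessBody;                 /* marks where parameter values are inserted   */
                              /* when running on the Workspace Server        */

%stpbegin;                    /* sets up ODS output for the requesting client */

title "Shoe sales for &region";
proc print data=sashelp.shoes noobs;
  where region = "&region";   /* the parameter arrives as a macro variable   */
  var subsidiary product sales;
run;

%stpend;                      /* closes the ODS output and returns it        */
```

Declaring the parameter with %global keeps the code from failing on an unresolved macro reference if the client does not supply a value; the report is simply empty in that case.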
Before we jump into some of the differences between the Workspace and Stored Process servers, it is worthwhile to note the mechanics of these two servers in the context of SAS Integration Technologies. Both servers are execution servers and allow for distributed processing through various clients. In plain English, both servers allow you to run your SAS programs from different clients (e.g., SAS Enterprise Guide, Excel, or a web interface) without actually needing SAS on your local computer. You may also hear them referred to as IOM (Integrated Object Model) servers. Your program of choice, in our case SAS Enterprise Guide, communicates with an IOM spawner, which in turn connects to either server and allows you to run your SAS code.
THE OBJECT SPAWNER
The Object Spawner controls your link to either of the SAS servers. When you run a stored process or any SAS code, the client submits a request to the Object Spawner, which then either starts up a Workspace Server session or connects you to an existing Stored Process Server session. The Object Spawner is an important component of this scalable architecture since it essentially acts as a gatekeeper, controlling how many sessions are started and how many processes run on a single server session. The Object Spawner can also be configured to know when to start up a new Stored Process Server versus directing client requests to an existing one. All server connections and subsequent disconnects are therefore controlled through the Object Spawner, which closes and essentially cleans up each session once your Stored Process has completed.
Here is where the first main difference between the two servers comes in. In the case of the Workspace Server, each time the spawner connects, it launches a new server process, or SAS session, which is dedicated to the SAS process that you are running. In contrast, when running on a Stored Process Server you are connecting to a single, continuous SAS session, which you share with other Stored Processes.
AUTHENTICATION
The idea of permissions and authentication presents the next major difference between these two servers. We know that one of the basic advantages of the SAS 9 Metadata architecture is the ability to enforce centralized control of data by setting user permissions around data libraries and data access points. Stored Processes have similar permission settings, which allow you to control who is able to run them. When running Stored Processes on the Workspace Server, the user’s credentials are passed through to the server and are then used to authenticate access permissions. Since credentials are passed at the user level, you are able to put more flexible controls around the data that a person can access and the stored processes that the person can run. Under the Stored Process Server, the user is connecting to a session that is available to numerous clients and users, and is therefore connecting using a general umbrella SAS user account, usually SASSRV. Consequently, this general SASSRV user ID must have fairly wide access privileges. This can cause obvious security concerns when it comes to surfacing sensitive data, since you lose the ability to restrict access to different groups or users.
You are able to control the access that different users have by setting the appropriate permissions. Defining permissions allows you to put various restrictions around all of your data and processes, such as who can read the data, who can write to the data source locations, and who can see and change the underlying metadata about your source or process. Furthermore, you are able to put different restrictions on the various data elements. For example, some sensitive data (e.g., employee salaries) should not be readily available to everyone in an organization. You would more than likely want to restrict salary data access to Senior Management, while HR should still be able to access other common elements of that data, such as the number of active employees. The Metadata architecture allows you to drill down on the levels of the data and set the appropriate permissions on anything that has been defined as a Metadata Object.
In order to facilitate this level of control, your Administrator must define each of your end-user IDs in the Metadata as a ‘User’. In addition to defining these individual Users, you can also define a set of common users as a ‘Group’. The members of a group share similar permissions and access, and are most likely part of the same division or team within an organization.
Once the user IDs are defined in the metadata, you can grant access to these folks so that they may run your Stored Process. As we mentioned earlier, at the moment granting access to your Stored Process to Users and Groups is only possible through the Workspace Server. Future releases may allow the Stored Process Server to authenticate using the User credentials, but for now it uses the SASSRV ID.
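One way to see this difference for yourself is to have a stored process write the identity it is running under to the log. The sketch below is only an illustration: SYSUSERID is a standard automatic macro variable holding the host account of the SAS session, while _METAUSER is, to our understanding, a reserved macro variable that SAS 9.1.3 supplies to stored processes with the client’s metadata login. Treat its availability as an assumption; the macro name `whoami` is our own, and the %symexist guard keeps the code from failing where the variable is not defined.

```sas
/* Hypothetical diagnostic macro: report which identities the session sees. */
%macro whoami;
  /* Host account under which this SAS session was launched */
  %put NOTE: Host account for this SAS session: &sysuserid;

  /* Metadata login of the requesting client, if the server supplies it */
  %if %symexist(_metauser) %then
    %put NOTE: Metadata login of the requesting client: &_metauser;
  %else
    %put NOTE: _METAUSER is not defined in this session.;
%mend whoami;

%whoami
```

On a Workspace Server you would expect the first note to show the user’s own account; on a Stored Process Server you would expect the shared account (SASSRV in the setup described above), whatever the second note reports.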
The following figures depict the process flow involved in running a stored process: (1) The Client Tools will connect to the Metadata Server to authenticate the Users’ permissions; (2) Once authenticated, the client will communicate with the Object Spawner; (3) The Object Spawner will then launch the appropriate server session or connect to an existing one.