Friday, December 20, 2013
Friday, December 2, 2011
Some SAS written interview questions
These are the questions I was asked in a written test for an interview. I thought I should share them.
Q2. Write code to create a SAS dataset named “StudentScoreNew” by picking each student’s most recent score. If there are multiple scores for the most recent year, then pick the highest one for that year.
Q3. Tag on the following variables to “StudentScoreNew” from the above question.
1. HighScore: the highest score of each student. (Use “StudentScore” as the input)
2. AvgScore: Average score for each student. (Use “StudentScore” as the input)
Q4. Complete the following macro.
%macro processyear(Inputds=StudentScore, Year=91,Outputds=Year_out);
%mend;
Year_out should have the following variables:
Year: the year specified in the input.
StudentHighScore: the “StudentID” of the student with the highest score for that year.
YearAvg: the average score for the input year.
Q1. Create a variable called “Flag” which indicates whether a student’s score increased or decreased from the previous record in the data. Mark a “0” for records where the student’s score was lower than the previous record. Conversely, mark a “1” for records where a student’s score was equal or higher than the previous record. Mark a “0” for the first record of each student
Output:
Obs Studentid year score flag
1 A 91 400 0
2 A 92 398 0
3 A 92 399 1
4 B 91 430 0
5 B 92 432 1
6 B 93 444 1
7 B 94 446 1
8 C 91 455 0
9 C 92 423 0
10 C 93 411 0
11 C 94 415 1
12 C 95 427 1
13 C 95 418 0
Program for question 1:
data studentscore ;
input Studentid $ year score;
cards;
A 91 400
A 92 398
A 92 399
B 91 430
B 92 432
B 93 444
B 94 446
C 91 455
C 92 423
C 93 411
C 94 415
C 95 427
C 95 418
;
run;
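data new;
set studentscore;
by studentid;                   /* the input is already sorted by studentid */
x=lag(score);                   /* score of the previous record; LAG runs on every row so its queue stays in step */
if first.studentid then flag=0; /* first record of each student */
else if score>=x then flag=1;   /* equal to or higher than the previous record */
else flag=0;                    /* lower than the previous record */
drop x;
run;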
proc print ;
var studentid year score flag;
title 'flags for Question one';
run;
Solution for Q2
Output:
Obs Studentid Recent_year Max_score
1 A 92 399
2 B 94 446
3 C 95 427
Program:
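PROC SQL ;
/* keep only the rows from each student's own most recent year, */
/* then take the highest score within that year                 */
create table studentscorenew as
select a.studentid, a.year as Recent_year, max(a.score) as Max_score
from studentscore as a
where a.year = (select max(b.year) from studentscore as b
where b.studentid = a.studentid)
group by a.studentid, a.year;
quit;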
proc print ;
title 'most recent maximum score ( Question two)';
run;
Solution for Q3 (HighScore and AvgScore of each student, using “StudentScore” as the input)
Output:
Studentid   N Obs    Mean   Maximum
-----------------------------------
A               3   399.0     400.0
B               4   438.0     446.0
C               6   424.8     455.0
-----------------------------------
Program:
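proc means data = studentscore mean max maxdec=1 nway;
class studentid;
var score;
/* name the statistics so the output data set has one row per student */
output out=question_3 (drop=_type_ _freq_) max=HighScore mean=AvgScore;
run;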
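The PROC MEANS step reports the two statistics, but the question also asks for them to be tagged on to “StudentScoreNew”. One possible way to do that is a join in PROC SQL. This is only a sketch: it assumes the StudentScore and StudentScoreNew data sets from Q1 and Q2 already exist, and the output name studentscorenew_q3 is made up for illustration.

```sas
proc sql;
  /* one row per student from StudentScoreNew, plus the two new columns */
  create table studentscorenew_q3 as
  select n.*,
         s.HighScore,
         s.AvgScore format=8.1
  from studentscorenew as n
       left join
       (select studentid,
               max(score) as HighScore,   /* highest score of the student */
               avg(score) as AvgScore     /* average score of the student */
        from studentscore
        group by studentid) as s
  on n.studentid = s.studentid;
quit;

proc print data=studentscorenew_q3;
  title 'StudentScoreNew with HighScore and AvgScore (Question three)';
run;
```

A left join keeps every row of StudentScoreNew even if a student were missing from the summary.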
Solution for Q4:
We used PROC SQL to write the macro. &Inputds refers to the input table, &Year is the numeric year to select, and &Outputds is the name under which the result table is saved.
Output:
Obs Studentid year AVGSCORE StudentHighScore
1 C 91 428.333 455
Macro Program:
……………
%macro processyear(Inputds=, Year=,Outputds=);
PROC SQL ;
create table &Outputds as select studentid,year,avg(score)AS AVGSCORE ,max(score) AS StudentHighScore
from &Inputds where year=&year having score=max(score);
quit;
proc print;
run;
%Mend processyear;
……………………….
Call this Macro:
………….
%processyear(Inputds=StudentScore, Year=91,Outputds=Year_out);
……………..
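The question asks for Year_out to hold Year, StudentHighScore (the StudentID of the top scorer) and YearAvg, while the solution above stores the top score itself under StudentHighScore. A variant that follows the naming in the question could look like the sketch below. It assumes the StudentScore data set from Q1 and relies on PROC SQL remerging the summary statistics with the detail rows, which SAS reports as a note in the log.

```sas
%macro processyear(Inputds=StudentScore, Year=91, Outputds=Year_out);
  proc sql;
    create table &Outputds as
    select year       as Year,
           studentid  as StudentHighScore,  /* ID of the top scorer */
           avg(score) as YearAvg            /* average over the whole year */
    from &Inputds
    where year = &Year
    having score = max(score);              /* keep only the top-scoring row(s) */
  quit;

  proc print data=&Outputds;
  run;
%mend processyear;

%processyear(Inputds=StudentScore, Year=91, Outputds=Year_out);
```

If two students tie for the highest score in a year, this version returns one row for each of them.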
Q5. Briefly describe what the following macro does:
%macro Glue(InputFile=,InputSheet=);
PROC IMPORT OUT= WORK.Master
DATAFILE= "&inputfile."
DBMS=EXCEL REPLACE;
SHEET="&inputsheet";
GETNAMES=YES;
RUN;
Proc Sort data=master;
by order;
run;
Data _null_;
set master;
by order;
call symput('Program'||'_'||strip(put(_N_,8.)),Program);
call symput('Type_of_prog'||'_'||strip(put(_N_,8.)),Type);
call symput('Inputlist'||'_'||strip(put(_N_,8.)),InputList);
call symput ('Location'|| '_'||strip(put(_N_, 8.)),Location);
call symput ('number_of_progs',_N_);
run;
%local a;
%do a= 1 %to &number_of_progs. ;
%include "&&Location_&a." ;
%if %upcase(&&Type_of_prog_&a.) eq MACRO %then %do;
%let program=&&Program_&a.;
%let inputlist=(&&Inputlist_&a);
%str(%&program. &inputlist.);
%end;
%end ;
%mend;
Sample Excel sheet for the above program (the macro reads the columns Order, Program, Type, InputList and Location):
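Solution:
Briefly, the macro Glue imports the specified worksheet into WORK.Master, sorts it by Order, and stores each row's Program, Type, InputList and Location in numbered macro variables (Program_1, Location_1 and so on), along with the number of rows in number_of_progs. It then loops over the rows and %INCLUDEs the file named in Location, for example conversion.sas or compressioncounts.sas. When the row's Type is MACRO, it also calls the macro named in Program with the parameters given in InputList.
This program can therefore be used to run several programs and macros from a single program: you only specify the location of a worksheet, and the worksheet lists the locations of the programs and macros.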
Friday, November 11, 2011
Some general SAS questions, part 1
This is the first in a series of general SAS language questions I am posting.
Q1# What are the components of the SAS language?
The SAS language has the following components:
1. SAS files
2. SAS data sets
3. External files
4. DBMS files
5. SAS language elements (DATA steps and PROC steps)
6. SAS macro facility
Q2# What are the different types of SAS sessions?
1# SAS windowing environment
2# Interactive line mode
3# Noninteractive mode
4# Batch (or background) mode
5# Object server mode (SAS runs as an IOM server)
Q3# How can you avoid specifying options every time you invoke SAS?
By placing SAS system options in a configuration file,
you can avoid having to specify the options every time that you invoke SAS.
In other words, the options in the configuration file become your defaults.
Q4# In how many ways does SAS allow you to access your input data remotely?
Ans: Through the following access methods:
1. SAS catalog: specifies the access method that enables you to reference a SAS catalog as an external file.
2. FTP: specifies the access method that enables you to use File Transfer Protocol (FTP) to read from or write to a file from any host computer that is connected to a network with an FTP server running.
3. TCP/IP socket: specifies the access method that enables you to read from or write to a Transmission Control Protocol/Internet Protocol (TCP/IP) socket.
4. URL: specifies the access method that enables you to use the uniform resource locator (URL) to read from and write to a file from any host computer that is connected to a network with a URL server running.
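As an illustration of the URL access method, a FILENAME statement can point a fileref at a web address, after which the DATA step reads it like any external file. This is a minimal sketch: the address, the fileref name remote and the file layout (student ID, year, score separated by blanks) are all assumptions made up for the example.

```sas
/* hypothetical address; replace with a real file on a web server */
filename remote url 'http://example.com/studentscore.txt';

data remote_scores;
  infile remote;                     /* read through the URL access method */
  input Studentid $ year score;      /* assumed layout of the remote file */
run;

filename remote clear;               /* release the fileref when done */
```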
Q5# What are the different sources of input data in a SAS program?
1# SAS data sets (SAS data files and SAS data views)
2# Raw data (external files and instream data)
3# Remote access: allows you to read input data from nontraditional sources
such as a TCP/IP socket or a URL. SAS treats this data as if it were coming from an external file.
Q6# How many types of missing values are there, and what are they?
There are three:
1. numeric
2. character
3. special numeric
Q7# How do you check for missing values in a DATA step?
You can use the N and NMISS functions to return the number of nonmissing and missing values,
respectively, from a list of numeric arguments.
When you check for ordinary missing numeric values, you can use code that is similar to the following:
if numvar=. then do;
If your data contains special missing values,
you can check for either an ordinary or special missing value with a statement that is similar to the following:
if numvar<=.z then do;
To check for a missing character value, you can use a statement that is similar to the following:
if charvar=' ' then do;
The MISSING function enables you to check for either a character or numeric missing value, as in:
if missing(var) then do;
In each case, SAS checks whether the value of the variable in the current observation satisfies
the condition specified. If it does, SAS executes the DO group.
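The checks above can be tried side by side on a few rows of instream data. The sketch below is only an illustration: the data set name, the variable names and the three data rows are invented, and the MISSING statement is used so that the letter A in a numeric field is read as the special missing value .A.

```sas
data check_missing;
  missing A;                              /* read the letter A as special missing .A */
  input id numvar charvar $;
  n_present       = n(id, numvar);        /* count of nonmissing numeric arguments */
  n_absent        = nmiss(id, numvar);    /* count of missing numeric arguments */
  ordinary_miss   = (numvar = .);         /* 1 only for the ordinary missing value */
  any_numeric_miss= (numvar <= .z);       /* 1 for ordinary or special missing */
  char_miss       = (charvar = ' ');      /* 1 for a blank character value */
  either_miss     = missing(numvar) or missing(charvar);
  datalines;
1 10 abc
2 . def
3 A .
;
run;

proc print data=check_missing;
run;
```

In list input a single period in a character field is read as a missing (blank) value, which is why the third row exercises the character check.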
Q8# What are the types of SAS constants (literals)?
1. character
2. numeric
3. date, time, and datetime
4. bit testing
Q9# How many types of errors are there in SAS?
1. Syntax errors
Syntax errors occur when program statements do not conform to the rules of the SAS language.
2. Semantic errors
Semantic errors occur when the form of the elements in a SAS statement is correct,
but the elements are not valid for that usage.
3. Execution-time errors
Execution-time errors are errors that occur when SAS executes a program that processes data values.
4. Out-of-resources conditions
An execution-time error can also occur when you encounter an out-of-resources condition,
such as a full disk, or insufficient memory for a SAS procedure to complete.
5. Data errors
Data errors occur when some data values are not appropriate for the SAS statements
that you have specified in the program. For example, if you define a variable as numeric,
but the data value is actually character, SAS generates a data error.
6. Macro-related errors
These are macro compile-time and macro execution-time errors, generated when you use the macro facility itself,
and errors in the SAS code produced by the macro facility.
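Q10# What do you do if you want processing to stop when a statement in a DATA step has a syntax error?
You can enable SAS to enter syntax check mode. You do this by setting the SYNTAXCHECK system option in batch or noninteractive mode, or by setting the DMSSYNCHK system option in the windowing environment.
Q11# How do you control what SAS does when a program has errors?
By default SAS flags errors and, where it can, continues processing. You can use the ERRORABEND system option to change this: with ERRORABEND set, SAS terminates for most errors instead of continuing, which is useful in batch jobs.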
Thursday, November 10, 2011
Something interesting about PROC REPORT
Here are a few interesting questions about PROC REPORT. Please add and post if you find a couple more interesting questions about PROC REPORT.
#What is the default usage and statistic for a numeric variable?
A numeric variable is an ANALYSIS variable by default, and the default statistic is SUM.
#How do you wrap lines of text when you have long values?
Use the FLOW option in the DEFINE statement.
Note: set the SPLIT= option as well.
#Why is the SPLIT= option needed?
FLOW honors the split character, which is a slash (/) by default, so values that contain slashes, such as dates, can break at the slash.
Setting SPLIT= to a character that does not occur in the text makes the text wrap between words (at blanks).
#How do you right align and left align a column?
Use the RIGHT or LEFT option in the DEFINE statement.
#How do you change the order of rows according to some variable?
Use the ORDER option in the DEFINE statement.
#How do you create a multipanel report?
Use the PANELS= option in the PROC REPORT statement.
#What are the options used in the BREAK statement?
OL, SUMMARIZE, SKIP and SUPPRESS.
#Can you compute a new variable in PROC REPORT?
Yes, with the COMPUTE statement.
Don't forget the ENDCOMP statement that closes the compute block before the RUN statement.
#What about a new character variable?
Use the CHARACTER keyword in the COMPUTE statement.
#How do you create a line after the column headings?
Use the HEADLINE option in the PROC REPORT statement.
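Several of these options can be seen together in one small report. The sketch below is an illustration only: it uses the SASHELP.CLASS sample data set, and the computed columns ratio and tall are invented for the example. HEADLINE, OL and SKIP affect the traditional listing output.

```sas
proc report data=sashelp.class nowd headline split='*';
  column sex name weight height ratio tall;
  define sex    / order 'Sex';                        /* controls the row order */
  define name   / display 'Student*name' left;        /* * is the split character */
  define weight / analysis sum 'Weight' right;
  define height / analysis sum 'Height' right;
  define ratio  / computed format=6.2 'Weight*per inch';
  define tall   / computed 'Tall?';

  compute ratio;                          /* new numeric variable */
    ratio = weight.sum / height.sum;
  endcomp;

  compute tall / character length=3;      /* new character variable */
    if height.sum > 60 then tall = 'Yes';
    else tall = 'No';
  endcomp;

  break after sex / ol summarize skip suppress;
run;
```

Analysis variables are referenced as variable.statistic inside a compute block, which is why the code uses weight.sum and height.sum.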
Tuesday, November 8, 2011
PROC MEANS
Here are a couple of questions about PROC MEANS and their answers.
Please add questions and/or ask any questions that are related to PROC MEANS.
#What is the difference between the CLASS and BY statements?
You don't have to sort the data before using a CLASS variable; a BY statement expects the data to be sorted by the BY variables.
#PROC MEANS vs PROC SUMMARY
You don't need the NOPRINT option in PROC SUMMARY, because not printing is its default.
#How will you find the number of missing values in PROC MEANS?
Use the NMISS statistic, for example in the OUTPUT statement.
#What is AUTONAME used for in the OUTPUT statement?
With AUTONAME, unique variable names for the output statistics are generated automatically (for example score_Mean).
#We know that the grand mean is included by default with a CLASS statement. How do you get rid of the grand mean?
Use NWAY in the PROC MEANS statement.
#How will you compute the means and number of nonmissing values of two variables a and b
for each combination of two other variables x and y?
Use multiple CLASS variables, together with the N and MEAN statistics:
class x y;
var a b;
#How do you test for missing values?
Use PROC MEANS to count the missing values for each variable:
proc means data=sasdsn nmiss noprint;
output out=missdsn(drop=_type_ _freq_) nmiss=;
run;
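The CLASS, NWAY and AUTONAME answers can be combined in one step. The following is a sketch under assumptions: it uses the SASHELP.CLASS sample data set in place of real data, and the output data set name class_stats is invented.

```sas
proc means data=sashelp.class nway noprint;
  class sex age;                 /* no prior sort needed for CLASS variables */
  var height weight;
  /* AUTONAME builds names such as Height_Mean and Weight_NMiss */
  output out=class_stats (drop=_type_ _freq_)
         n= mean= nmiss= / autoname;
run;

proc print data=class_stats;
run;
```

Because of NWAY, class_stats holds one row per combination of sex and age and no overall row.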
Monday, November 7, 2011
PROC PRINT: something interesting
Q1. How do you omit the Obs column and replace it with a variable, say empid?
Ans: Use the ID statement:
id empid;
#You want the data sorted but you don't want the original dataset changed.
Use the OUT= option of PROC SORT, e.g. out=newdata;
#To break down your listing into groups
use the BY statement.
#To do totals and subtotals
use the SUM statement.
#You don't want the ID variable repeated on every row of the first column
Use the same variable in the BY and ID statements:
by empid;
id empid;
#How to print the variable names horizontally
Use the HEADING=HORIZONTAL option.
_______________________
PROC PRINT has one useful property: when no DATA= option is given, it prints the observations of the most recently created dataset
by default, e.g.
proc print;
run;
Using this property you can quickly check whether the step you just ran created (or re-created) the dataset you expected.
________________________
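Put together, the answers above might be used as in the sketch below. It is an illustration only: SASHELP.CLASS stands in for real data, and class_sorted is an invented output name.

```sas
/* sort into a new data set so the original stays unchanged */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

proc print data=class_sorted heading=horizontal;
  by sex;                       /* break the listing into groups */
  id sex;                       /* same variable in BY and ID: printed once per group */
  var name age height weight;
  sum weight;                   /* subtotals per group and a grand total */
  title 'Class listing by sex';
run;
```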