sbs {ReGenesees} | R Documentation |
The sbs data frame stores artificial sbs-like sampling data, while sbs.frame is the artificial sampling frame from which the sbs units have been drawn. They make it possible to run the R code contained in the ‘Examples’ section of the ReGenesees package help pages.
data(sbs)
The sbs data frame mimics data observed in a Structural Business Statistics survey, under a one-stage stratified unit sampling design. The sample is made up of 6909 units, for which the following 22 variables were observed:
id
Identifier of the sampling units (enterprises), numeric
public
Does the enterprise belong to the Public Sector? factor with levels 0 (No) and 1 (Yes)
emp.num
Number of employees, numeric
emp.cl
Number of employees classified into 5 categories, factor with levels [6,9], (9,19], (19,49], (49,99], (99,Inf] (note that small enterprises with fewer than 6 employees fell outside the scope of the survey)
nace5
Economic Activity code with 5 digits, factor with 596 levels
nace2
Economic Activity code with 2 digits, factor with 57 levels
area
Territorial Division, factor with 24 levels
cens
Flag identifying statistical units to be censused (hence defining take-all strata), factor with levels 0 (No) and 1 (Yes)
region
Macroregion, factor with levels North, Center, South
va.cl
Class of Value Added, factor with 27 levels
va
Value Added, numeric (contains NAs)
dom1
A planned estimation domain, factor with 261 levels (dom1 crosses nace2 and emp.cl)
nace.macro
Economic Activity Macrosector, factor with levels Agriculture, Industry, Commerce, Services
dom2
A planned estimation domain, factor with 12 levels (dom2 crosses nace.macro and region)
strata
Stratification Variable, a factor with 664 levels (obtained by crossing variables region, nace2, emp.cl and cens)
va.imp1
Value Added Imputed1, numeric (NAs were replaced with average values computed inside imputation strata obtained by crossing region, nace.macro, emp.cl)
va.imp2
Value Added Imputed2, numeric (NAs were replaced with median values computed inside imputation strata obtained by crossing region, nace.macro, emp.cl)
y
A numeric variable correlated with va
weight
Direct weights, numeric
fpc
Finite Population Corrections (given as sampling fractions inside strata), numeric
ent
Convenience numeric variable identically equal to 1 (sometimes useful, e.g. to estimate the total number of enterprises)
dom3
An unplanned estimation domain, factor with 4 levels
The sbs.frame sampling frame (from which sbs units have been drawn) contains 17318 units.
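The descriptions of va.imp1 and va.imp2 refer to mean and median imputation inside strata. As a rough illustration of that idea only, the sketch below applies both to a small made-up data frame with base R. The object toy, the helper impute() and all the values are hypothetical; this is not the code that was used to build sbs.

```r
# Toy data (hypothetical values, not taken from sbs)
toy <- data.frame(
  region     = factor(rep(c("North", "South"), each = 4)),
  nace.macro = factor(rep(c("Industry", "Services"), each = 4)),
  emp.cl     = factor(rep("[6,9]", 8)),
  va         = c(10, NA, 30, 80, 5, 7, NA, 15)
)

# Imputation strata: crossing of region, nace.macro and emp.cl
imp.strata <- interaction(toy$region, toy$nace.macro, toy$emp.cl, drop = TRUE)

# Hypothetical helper: replace NAs with a within-stratum summary
# (mean or median) of the observed values
impute <- function(x, g, fun) {
  ave(x, g, FUN = function(v) {
    v[is.na(v)] <- fun(v, na.rm = TRUE)
    v
  })
}

toy$va.imp1 <- impute(toy$va, imp.strata, mean)    # mean imputation
toy$va.imp2 <- impute(toy$va, imp.strata, median)  # median imputation
toy
```

The two imputed columns differ only in the rows where va is NA, which makes it easy to compare the effect of the two summaries on a skewed stratum.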
data(sbs)
head(sbs)
str(sbs)
str(sbs.frame)
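Several of the variables described above are classifications or crossings of other variables. The following base R sketch shows, on made-up data, how a size class with the same level labels as emp.cl could be obtained with cut(), how two factors can be crossed with interaction(), and how a constant column such as ent gives a simple weighted count. The object toy and its values are assumptions for illustration, and the weighted sum is a plain point estimate computed by hand, not a call to any ReGenesees function.

```r
# Toy data (hypothetical values, not taken from sbs)
toy <- data.frame(
  emp.num = c(6, 9, 10, 19, 20, 49, 50, 99, 100, 2500),
  region  = factor(rep(c("North", "South"), times = 5)),
  weight  = c(3, 3, 2.5, 2.5, 2, 2, 1.5, 1.5, 1, 1),
  ent     = 1
)

# Size classes with the same labels as emp.cl:
# [6,9] (9,19] (19,49] (49,99] (99,Inf]
toy$emp.cl <- cut(toy$emp.num, breaks = c(6, 9, 19, 49, 99, Inf),
                  include.lowest = TRUE)
table(toy$emp.cl)

# Crossing two factors, in the spirit of dom1, dom2 and strata
toy$dom <- interaction(toy$region, toy$emp.cl, drop = TRUE)
nlevels(toy$dom)

# Weighted sum of the constant 'ent': a point estimate of the
# total number of enterprises
sum(toy$weight * toy$ent)
```

Using drop = TRUE in interaction() keeps only the combinations that actually occur, which matters when the crossed factors have many levels.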