Product Development
This page provides information on product development and instructions for using
online applications. It also describes specific features of the datasets that are used in the
product development process. Any other material that helps users understand datasets and
definitions, and optimise analysis of the data, also forms part of this section.
Methods
Product development process
Data products for publication are prepared following the ETL (extract, transform, load) process.
Survey or administrative data are extracted from databases that have been fully processed.
Such datasets are used in analysis and report writing. Datasets are processed following the
standards prescribed in the Fundamental Principles of Official Statistics and the South African
Statistical Quality Assessment Framework.
The publishable data items are extracted from the source databases into dissemination databases.
Various tools, including SAS and SQL, are used to transform and standardise data for dissemination.
The process does not involve modification of datasets; any inconsistencies identified are investigated
and corrected in the source database. This feedback process continues until the final datasets are loaded
into dissemination platforms for public access.
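The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the production SAS or SQL workflow: the records, the column names, the `PUBLISHABLE` list and the negative-count check are all hypothetical. The point it shows is that standardisation works on a copy, and that flagged records are reported back to the source rather than altered in the dissemination step.

```python
# Hypothetical records standing in for a fully processed source database.
SOURCE_ROWS = [
    {"province": " gauteng ", "sex": "F", "count": 120, "interviewer_id": "A17"},
    {"province": "Limpopo", "sex": "M", "count": -5, "interviewer_id": "B02"},
]

# Assumed list of publishable data items; everything else stays in the source.
PUBLISHABLE = ("province", "sex", "count")


def extract(rows, columns):
    """Keep only the publishable data items from each source record."""
    return [{c: row[c] for c in columns} for row in rows]


def standardise(row):
    """Return a copy with labels in a standard form; data values are untouched."""
    out = dict(row)
    out["province"] = row["province"].strip().title()
    return out


def find_inconsistencies(rows):
    """Return positions of suspect records to investigate in the source database."""
    return [i for i, row in enumerate(rows) if row["count"] < 0]


staged = [standardise(r) for r in extract(SOURCE_ROWS, PUBLISHABLE)]
issues = find_inconsistencies(staged)

# Load only when nothing is flagged; otherwise feed the issues back to the source.
ready_to_load = not issues
```

Here the second record is flagged, so nothing is loaded until the source database has been corrected and the extraction is run again.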
Database design
Database design for publication is based on the relational database
model. Databases consist of a collection of facts and classifications linked through
relational database principles. All the facts and classifications are also detailed in the metadata
document for the particular dataset. The metadata document is based on the SDMX standard.
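As a rough illustration of facts linked to classifications, the sketch below builds a tiny in-memory SQLite database. The table names, column names and figures are invented for the example and are not taken from any actual dissemination database; only the general pattern (a fact table referencing a classification table by code) reflects the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (            -- classification: code and label
        code  TEXT PRIMARY KEY,
        label TEXT NOT NULL
    );
    CREATE TABLE population_fact (     -- facts, keyed by classification code
        province_code TEXT    NOT NULL REFERENCES province(code),
        year          INTEGER NOT NULL,
        persons       INTEGER NOT NULL
    );
""")

# Illustrative values only.
conn.executemany(
    "INSERT INTO province VALUES (?, ?)",
    [("GP", "Gauteng"), ("WC", "Western Cape")],
)
conn.executemany(
    "INSERT INTO population_fact VALUES (?, ?, ?)",
    [("GP", 2022, 100), ("WC", 2022, 60), ("GP", 2011, 80)],
)

# Join facts to their classification labels for publication.
rows = conn.execute("""
    SELECT p.label, f.year, f.persons
    FROM population_fact AS f
    JOIN province AS p ON p.code = f.province_code
    ORDER BY p.label, f.year
""").fetchall()
```

Keeping labels in the classification table means a label change is made once and every fact that references the code picks it up.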
Boundary alignment
The South African dissemination geography is linked to the provincial, district municipality, local municipality
and ward levels. When the boundaries change, official statistics products are adjusted accordingly. All census
products are adjusted to provide estimates based on the current boundaries. For administrative
data sources and surveys, the alignment takes place at the source. In sample surveys, the changes are reflected
in the revision of the master sample.
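One simple way to picture the adjustment of census products to current boundaries is re-aggregation through a correspondence table. The sketch below assumes that each ward falls wholly inside one current municipality; where boundaries cut across wards, a real alignment would need to apportion values, which is not shown. The ward codes, municipality codes and counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical ward-level counts from a census product.
ward_counts = {"W1": 100, "W2": 250, "W3": 50}

# Hypothetical correspondence: ward -> municipality on the current boundaries.
ward_to_current_municipality = {"W1": "M_A", "W2": "M_A", "W3": "M_B"}


def realign(counts, correspondence):
    """Sum ward-level counts into the current municipality of each ward."""
    totals = defaultdict(int)
    for ward, value in counts.items():
        totals[correspondence[ward]] += value
    return dict(totals)


current_totals = realign(ward_counts, ward_to_current_municipality)
```

When the boundaries change again, only the correspondence table needs updating; the ward-level counts are reused as they are.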
Survey reweighting
After every census, the population estimates are revised to reflect the latest
information. All sample surveys use population estimates to compute final survey weights. The surveys
are then revised based on the revised estimates and republished.
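To make the effect of revised population estimates on survey weights concrete, here is a minimal sketch of a ratio adjustment within groups. It is an assumption-laden illustration: the actual weighting procedure may use a different calibration method, and the group labels, weights and revised totals below are invented. Each weight is scaled so that the weighted total in a group matches the revised estimate for that group.

```python
def reweight(records, revised_totals):
    """Scale weights so each group's weighted total equals its revised estimate."""
    current = {}
    for r in records:
        current[r["group"]] = current.get(r["group"], 0.0) + r["weight"]
    return [
        {**r, "weight": r["weight"] * revised_totals[r["group"]] / current[r["group"]]}
        for r in records
    ]


# Hypothetical respondents with weights based on the old population estimates.
records = [
    {"group": "A", "weight": 100.0},
    {"group": "A", "weight": 300.0},
    {"group": "B", "weight": 200.0},
]

# Hypothetical revised population estimates per group.
revised = {"A": 500.0, "B": 180.0}

reweighted = reweight(records, revised)
```

In this example group A is scaled up by a factor of 1.25 and group B is scaled down by a factor of 0.9, so any estimate computed from the survey changes accordingly, which is why the results are republished.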
Time-series index files
The economic time series are prepared according to a standard structure, which
is also part of the metadata. The columns and cells are described based on the data
items for the specific survey. The description of index file metadata is available
here.
Manuals
The SuperSTAR manual, containing a guide to SuperCROSS and SuperWEB2, is available
here.