What is DW?
A DW is : A collection of integrated , subject
oriented Db designed to support the DSS (decision support system) function,
where each unit of data is relevant to some moment in time.
A copy of TD specifically structured for query &
analysis.
Define InfoObject?
Busines s evaluation objects (customers. sales
revenues...) are known in BW as InfoObjects. They are divided into
characteristics, key figures, units and time characteristics.
Differentiate between InfoObject and
InfoSource?
InfoObject: Business evaluation objects
are called InfoObjects.
InfoSource: An InfoSource is always a
quantity of InfoObjects that logically belong together
What is Source system?
All
systems those are available to the Business Information Warehouse for data
extraction. This could be an R/3 system but could equally be an external system
like Oracle Applications, Siebel, XML, Flat file data.
What are the types of Source
Systems?
SAP R/3
Systems – From release 3.OD
External Source Systems (Non
SAP, Flat Files, Data providers, R/2, BW)
->How each source system
communicates to the BW system?
SAP R/3 OLTP and BW -> ALE
and/or TRFC
External systems -> BAPI
Define source system?
All systems that provide the SAP Business
Information Warehouse with data are indicated as source systems.(A SS is any
system that provides data for SAP BW)
What is an info source?
Ø The
Info Source prepares the data for the permanent storage with in Infocube.
Ø Supplies
BW with source system data that logically belongs together from a business point
of view
Ø Info
source is system specific
What is data
Staging?
The collection of extracted data from various source
systems and their homogenization is called Data Staging
List and explain the components of InfoSource?
An InfoSource is always a
quantity of InfoObjects that logically belong together. The structure in which
these are stored is called a communication structure. Data is updated in the
InfoCubes from this structure. In this way, the system always accesses the
active saved version of the communication structure
What are the types of data an
Info source contains?
Info Source contains either
transaction data or master data (Attributes, Texts and Hierarchies)
What is maximum number of dimensions that
one can have in an InfoCube?
16 including the 3 Dimensions provided by SAP i.e.
Unit, Time and InfoPackage
What are DIM ID’s and SID’s?
DIM ID means Dimension ID. SID means Set ID.
What does the BW statistics cube do?
BW statistics enables the users to get an overview
of the BW system such as how InfoCubes, InfoObjects, InfoSources, Source
Systems, Queries and Aggregates are used in the system.
Ø It offers support to review
the system performance.
Ø It helps the users to manage
the BW system.
Ø It assists the users to
maximize the usage and minimize the updating cost of Aggregates.
Differentiate between Data Staging and
Data Store.
Data Staging: The entire process of
controlling the data transfer from various data sources to the BIW can be
summarized under the generic term ‘Staging Engine’.
Data Store: BW makes a store available
for operative data. Here the transaction data from the OLTP can be saved in one
step. In the next step update of the InfoCube can take place. ODS enables:
Ø Check of imported data
Ø Data manipulation
Ø Synchronize loading from
different source systems (time lag)
Ø Refreshing InfoCube (no OLTP
traffic)
Initialization of InfoCubes from historical data (no
OLTP traffic)
Differentiate between Source System and an
InfoSource.
Source Systems: All systems that provide
the SAP Business Information Warehouse with data are indicated as source
systems.
InfoSource: An InfoSource is a
summarized quantity of information that logically belongs together for a unit.
An InfoSource is always a quantity of InfoObjects that logically belong
together. If fields, which logically belong together, exist in various source
systems, they can be brought together for an InfoSource in BW.
Differentiate between Extract Structure
and Transfer Structure.
Extract Structure: In the extract structure,
data from a DataSource is staged in the source system. The extract structure
contains the amount of fields that are offered by an extractor in the source
system for the data loading process.
Transfer Structure: The transfer structure is
the structure in which the data is transported from the source system into the
SAP Business Information Warehouse. It provides a selection of the extract
structure fields for the source system.
Note: When you activate the
transfer rules in BW, an identical transfer structure for BW is created in the
source system from the extractor structure
Why do you need transfer structures in
both OLTP and OLAP?
An InfoSource in BW requires at least one DataSource
for data extraction. DataSource data that logically belongs together is staged
in an extract structure in an SAP source system. In the transfer structure
maintenance, you determine which extract structure fields are transferred to
the BW. When you activate the transfer rules in BW, an identical transfer
structure for BW is created in the source system from the extractor structure.
This data is transferred 1:1 from the transfer structure of the source system
into the BW transfer structure, and is then transferred into the BW
communication structure using the transfer rules.
Differentiate between transfer rules and
update rules.
Transfer Rules: When you have maintained the
transfer structure and the communication structure, you use the transfer rules
to determine how the transfer structure fields are to be assigned to the
communication structure InfoObjects. You can arrange for a 1:1 assignment. But
you can also fill InfoObjects using routines or constants.
Update
Rules:
The update rules specify how the data (key figures, time characteristics,
characteristics) are updated in the InfoCube from the communication structure
of an InfoSource. You are therefore connecting an InfoSource with an InfoCube
How data extraction will be
executed?
SAP R/3 OLTP -> Using API’s (Application Programming
Interface)
External systems -> Using
third party tools or file system
How master data is assigned in
BW?
In BW master data is assigned to the Characteristics
and can be represented using Attributes, Texts and Hierarchies
How many source systems and
communication structures a Transfer structure can refer to?
One,
a Transfer Structure always refers to one Source system and one Communication
structure
How
OLTP Applications communicate with BW?
Source System |
Communication Procedure |
SAP |
Service API |
Oracle, SQL Server, Databases |
DB Connect |
ETL Tools e.g.: Informatica, Ascential |
BAPI |
Flat File |
File Interface |
XML |
XML Interface |
Where does Extract mechanisms that prepare productive data
per application for the BW exists?
Any OLTP System (Eg: SAP
R/3, Oracle, Informatica, Siebel, XML, Flat file, etc)
What
is the data warehouse Objectives?
a.
Extract data from different source systems is the
main functionality of data warehouse
b.
Manipulate data and generate reports, this process
is called Data modeling. This is also called OLAP Processing
c.
Make such reports available to the decision makers
(Info presentation)
What
is the Architecture of BW?
d.
BW has a multi- level architecture to provide the
maximum degree of flexibility
e.
BW can extract and use data provided by a variety of
sources. These include R/3 and R/2 systems, non-SAP systems, flat files, commercial
data providers, and even other BW systems.
f.
The BW server provides all the tools necessary to
model, extract, transform, aggregate, store, and access data. Since the
description of the data, regardless of its source, is contained in a common
Metadata Repository, data from a variety of sources can be combined to give you
enhanced data analysis options.
d. BW users can access data through the SAP Business
Explorer, a standard web
Browser, or third-party display tools, which have been certified for use with a
common BAPI interface.
Where Meta Data is managed Business Information Server?
It
is managed in a Meta data Repository in BW Server. Meta data gives the
information that characterizes the data in the data warehouse environment.
What is Staging Engine?
The
entire process of controlling data transfer from various source systems to the
BW can be summarized under the generic term Staging Engine.
Staging process involves following stages:
a.
Extract source structure is used to temporarily hold
data in the OLTP.
b.
Data is copied to transfer structure on the OLTP.
c.
Using ALE or TRFC data is transferred into BW.
d.
Transfer rules can be used for homogenization of
data.
e.
Update rules describe data update into Infocube
through communication structure.
What is the difference between Operational (OLTP)
Vs Decision support environments (OLAP)?
Characteristic
|
OLTP
|
OLAP
|
Primary Operation
|
Create, Delete, Update data (DML operations are performed) |
Analyze the data |
Level of Analysis
|
Low |
High |
Amount of Data per Transaction |
Very small |
Very large |
Type of Data |
Detailed |
Summary |
Relevance of Data |
Current |
Current and historical |
Data Updates |
Often |
Less frequent, only new data |
Database Concept |
Complex |
Simple |
Number of Transactions/Users |
Many |
Few |
Time Frame |
Point in time |
Time period |
Database Data |
Normalized |
Denormalized |
Number of Tables per Transaction |
Several |
Few |
Type of Processing |
Well-defined |
Ad hoc |
What is Data Staging?
ü
The staging process prepares data for the
permanent storage.
ü
The BW system can handle heterogeneous data
sources, SAP and non – SAP source systems.
ü
The Meta Data repository includes technical
and business related information bout the data in BW.
ü
Most important Meta Data information within
the staging process are described in Info Objects and Info Sources.
ü
The data staging for the Business Information
Warehouse is also possible from external systems and flat files.
What is Info-IDoc?
ü
Info-Idocs contain information about whether
or not the data Idocs were correctly uploaded.
ü
By specifying a frequency, you determine the
number of data Idocs after which an Info IDoc is to be sent. In other words,
how many data IDocs are described by a single Info-Idoc
ü
You can see in the BW Monitor from each info
IDoc whether the load process was successful or not. The traffic light is green
if this is the case for all data Idocs described by an InfoIDoc
What are the Meta Data Tables
prefix for OLTP (SAP R/3) and BW?
SAP R/3 –
RO* (Around 64 Tables)
BW – RS*
(Around 68 tables)
Metadata
is data about data. This means that Metadata describes the origin, history and
further aspects of the data. Using the Metadata, the information that is stored
in the SAP Business Information Warehouse can be effectively used for reporting
and analysis.
There are
different classes of Metadata:
Ø
Technical
Ø
Business-orientated
Explain about “Business Content” in SAP Business
Information Warehouse?
Predefined role and task-related information models
that can be suited to enterprise-specific requirements. Business Content makes
the supply of information available to roles in an enterprise, which requires
this in order to complete its tasks. Business Content essentially consists of
roles, workbooks, queries, InfoCubes, InfoObjects, InfoSources and update
rules, as well as extractors, for SAP R/3, SAP New Dimension Applications and
for additional selected applications.
All are available in the SAP delivery system, the
so-called D (Delivery Version) version.
If you want to work with
the Content objects, then you have to transfer them from the D (Delivery)
version to the A (Active) version.
Version
|
Explanation
|
D
|
SAP Content Delivery
Version
|
A
|
Active version for usage
|
M
|
Change Version
|
Why and How SID’s are shared
across InfoCubes?
SID Tables are nothing but
Set ID or Surrogate ID Tables. These tables contain Master Data, Texts and
Hierarchies for respective Characteristics in the Dimensions.
Since the Master data does
not change frequently, the Data is made shareable for all the InfoCubes
Internal key of type INT4
that you use for master data for master data-bearing characteristics,
especially for hierarchy nodes and for characteristic names. Master data IDs
and characteristic values are stored in master data tables (SID tables).
Information about time- independent or time-dependent master data, which is
stored in a P table or a Q table, is saved again in an X or Y table, using SIDs
instead of characteristic values.
Explain
briefly about the Staging process?
The entire process of
controlling the data transfer from various data sources to the BIW can be
summarized under the generic term “Staging Engine”. The main purpose of the
Staging process includes the collection of extracted data and their
homogenization.
This includes the following:
Ø Establishing
the connection between Source Systems and BW System.
Ø Meta
Data Maintenance.
Ø InfoObject
definition.
SAP has predefined 3 dimensions. What are they and why they were provided by
SAP?
The 3 dimensions
provided by SAP are Time, Unit and InfoPackage.
Ø Time:
The explicit time dimension is needed to describe Fiscal Periods, Seasons,
Holidays, Weekends and other calendar calculations that are difficult to get
from the SQL date machinery.
Ø Unit:
Multi currency conversion situations and also the quantity conversions can be
best handled as the system provides this dimension.
Ø InfoPackage:
Data is uploaded as an InfoPackage. The InfoPackage ID is generated by the
system. InfoPackage is primarily used for the purpose of scheduling and
rollback of data if and when required.
What is “Extended Star Schema” and how did it
emerge?
The
Star Schema consists of the Dimension Tables and the Fact Table. The Master
Data related tables are kept in separate tables, which has reference to the
characteristics in the dimension table(s). These separate tables for master
data is termed as the Extended Star Schema
List and explain the components of Administrator
Workbench (AWB).
The components of AWB are:
Ø Modeling:
In modeling, you can create and edit all the objects and rules of the
Administrator Workbench that are needed for data transport, update and
analysis.
Ø Monitoring:
In Monitoring, you have the option of overseeing and controlling the data
loading process into the Business Information Warehouse.
Ø Business
Content: Contains a repository of objects (InfoCubes, InfoSources, InfoObjects,
Queries etc). These can be made use of by activating the required objects. We
may use it as it is or make some modifications to suit our requirement.
Ø Metadata
Repository: In the Metadata Repository, you can display active objects in the
system (Activated Objects) and objects delivered by SAP (Business Content).
When & how Dimension tables in an InfoCube get
populated?
The Dimension tables gets
populated during the Transactional Data upload.
For every combination of the
SID’s in the dimension, a DIM ID is created
What are different types of data with respect to
BIW? In what sequence they are loaded
and why?
Meta Data, Master Data, Transactional Data.
ü First: Meta data: is uploaded
ü Second: Master Data is uploaded.
ü Third: Transaction Data is uploaded
How “Degenerate Dimensions” & “Sparsely
Populated Cube” are created? Explain
reasons why they are desired / not desired?
Degenerate Dimensions: Arises incase where the grain
of the Fact Table represents an actual working document like Order number or
Invoice number.
ØDesired: Such a situation is desired
in cases where the high level of granularity is required.
ØNot Desired: Since in cases where we
need to go in for detail, we can go to OLTP side of data. Also this level of
granularity increases the InfoCube size and affects the query performance.
Sparsely Populated Cube: In the Fact Table, when not
all fields are filled with data for each record inserted, the field(s), which
is not filled, also occupy space. Over a period of time a lot of disk space is
wasted which decreases the performance of reporting. This is situation is
called “Sparsely Populated Cube”.
Not Desired:
This situation is not
desired as there is unnecessary increase in the disk space
What are Aggregates?
How an Aggregate created? Under
what circumstances an Aggregate cannot be created?
Aggregates are the summarized data of an InfoCube.
Aggregates form a subset of an InfoCube.
ü Aggregates cannot be created
where the master data is time dependent.
Write briefly about various types of InfoObjects and
InfoSources and what are the differences between them?
InfoObjects: The different types of
InfoObjects are:
Ø Characteristics: The qualitative assessment
of data.
Ø Key Figures: The quantitative aspects of
data
Ø Time Characteristics: Provided by SAP. Basically
they are Characteristics.
Ø Units: Key Figures provided by
SAP. They are Units of measure.
Ø Technical Characteristic
Info Object: eg: Document number
InfoSources: The different types of InfoSources are:
Ø Master Data InfoSource: For uploading Master Data.
Ø Transaction Data InfoSource: For uploading Transaction
Data.
Give a sketch mapping Dimension Key in an InfoCube
from its origin. Ans ?
How would you decide whether an InfoObject should be
used well as a Dimensional Characteristic or as an attribute of a basic
Characteristic?
Dimensional Characteristic is basically a ‘strong’
entity. The other entities, which describe this basic characteristic, form the
attributes of the basic characteristic.
Eg: If Material Car is taken as a basic
characteristic, then the other attributes like the Color, Accessories etc form
the attributes of the basic characteristic Material Car
What is the single most important factor that
determine the performance of the InfoCube and Why?
Granularity. Higher the granularity, greater level of detail to
which the data exists in the Dimension and hence the Fact Table. This increases
the disk space and hence directly proportional to the performance.
When extracting data from OLTP to InfoCube, please
list out the areas where and how data can be manipulated?
Data can be manipulated in the following areas:
Ø From Extracts to the
Transfer Structure (at OLTP): Transfer Rules
Ø From Transfer Structure to
the Communication Structure (in OLAP): Transfer Rules.
Ø From Communication Structure
to InfoCube (in OLAP): Update Rules.
Why the size of Data Warehouse is always much bigger
than an OLTP environment?
The size of data in a DW is much more that in
OLTP as there are huge numbers of indexes.
“Normalization in Data Warehouse”. Write your comments on this statement.
The basic feature of Normalization is avoiding of
duplication of data through the use of Primary Key and Foreign Key
relationships between tables. However this concept slows the query performance.
To increase the query performance is the main motto of DW. The SID and Fact
Tables are highly normalized. However the Dimension Tables are highly
de-normalized.
Any
effort to normalize the dimension tables results in saving less than 1% of the
disk space in comparison to the Fact Table. Thus rendering the effort a waste
of time
Can we upload transactional data even when
master data is not available? Explain.
Yes.
However we cannot query for that particular record.
What is ODS and what is it used for?
An ODS Object serves to store consolidated and
debugged transaction data on a document level (atomic level). It describes a
consolidated dataset from one or more InfoSources. This dataset can be analyzed
with a BEx Query or InfoSet Query.
An
ODS Object contains a key (for example, document number/item) as well as data
fields that can also contain character fields (for example, order status,
customer) as key figures. The data of an ODS Object can be updated with a delta
update into InfoCubes and/or other ODS Objects in the same system or across
systems
What is the significance of ODS in BIW?
An
ODS Object serves to store consolidated and debugged transaction data on a
document level (atomic level). It describes a consolidated dataset from one or
more InfoSources. This dataset can be analyzed with a BEx Query or InfoSet
Query. The data of an ODS Object can be updated with a delta update into
InfoCubes and/or other ODS Objects in the same system or across systems.
In
contrast to multi-dimensional data storage with InfoCubes, the data in ODS
Objects is stored in transparent, flat database tables
Why does one need to maintain
aggregates? Under what conditions should
one not maintain aggregates?
Being
a subset of an InfoCube, Aggregates can be used for improving the query
performance. There are no aggregates for time-dependent characteristics and
their dependent navigational attributes.
We have to do a master data upload before
a transactional data upload, why?
During the Transactional Data Upload, the Dimension
Tables get populated. The DIM ID’s are created based on the SID’s. ie for
almost every characteristic there is a corresponding Master Data which has to
be referred during the transactional data upload.( get error missing DIM ID)
Under what reporting scenarios do we need
to make master data time dependent?.................
Efforts to normalize any of the tables in
a dimension in order to save space are a waste of time. Why?
Efforts to normalize any of the tables in a
dimension in order to save space are a waste of time as the space saved is less
than 1% of the size of the Fact Table.
In a fact less fact table please explain what are event tracking
tables and coverage tables giving a small example of each?
Events
are often modeled by a fact table containing a number of keys, each
representing a participating dimension in the event. Such event tables often
have no obvious numerical facts associated facts associated with them, and
hence are called factless fact tables.
Difference between analyzing data using a
LIS structure and an InfoCube (at least 6 points of difference)
|
|
Detailed transaction data
|
Often
summarized data
|
Current data with minimum history
|
Significant
history required
|
Changing with business events
|
Fixed
as of specific point in time
|
Minimally integrated with data from other modules
|
Significant
integration with data from other modules.
|
Highly normalized
|
Often
highly Denormalized ie restructures for queries
|
Update / Insert / delete
|
Read
only
|
45. What are the goals of data modeling
for a data warehouse?
The goals of data modeling for Data Warehousing are
ØA model that organizes the information according to
the primary business lines (eg. Product line, sales organization, time,) -- the dimensions.
ØA model that allows you to analyze the information
suing any combination of dimensions.
ØA model that allows you to define the aggregation
level and the returned information for a dimension without regard to other
dimensions.
List the decision points in designing a
dimensional data?
The processes and hence the identity of the fact
tables.
- The grain of each fact table
- The dimensions of each fact table
- The facts, including precalculated facts.
- The dimension attributes with complete descriptions and proper
terminology.
- How to track slowly changing dimensions.
- The aggregations, heterogeneous dimensions, mini dimensions, query
modes and other physical storage decisions.
- The historical duration of the database
- The urgency with which the data is extracted and loaded into the
data warehouse.
How would you decide whether an info
object should be used better as a dimensional characteristic or as an attribute
of a basic characteristic?
Dimension attributes that
will be frequently used for navigation (for drill down purpose) should be in
the dimension table.
A dependant parent dim.
Attribute should be placed in a master table (navigational attribute).
Differentiate between Meta data and Master
data.
Meta Data: Data that describes the
structure of data or MetaObjects is called Metadata. In other words data about
data is known as Meta Data.
Master Data: Master data is data
that remains unchanged over a long period of time. It contains information that
is always needed in the same way. Characteristics can bear master data in BW.
With master data you are dealing with attributes, texts or hierarchies.
Differentiate between Transaction data,
Master data and Metadata updates.
Data relating to the day-to-day transactions is the
Transaction data.
Differentiate between full upload and
delta upload.
A full update requests all data that corresponds
to the selection criteria you determined in the Scheduler.
A delta update requests only the data that
has been accumulated since the last update, and adds these to the database.
Difference between Infocube & ODS
Objects
Infocube
|
ODS
|
1.It is a multi dimensional object (MDM)
|
1.It is a flat transparent data table
|
2.Fact tables, dimension tables exist
|
2.No fact and dimension tables exist
|
3. Follows extended star schema
|
3.No star schema
|
4.It supports Addition and No Update
|
4. It supports Addition, no update and Overwrite
|
What is Query Jump Target?
It is used to run other queries, transactions and
ABAP reports from output of a query. Sometimes output of basic query will be
treated as input to the other report
By using RRI (Report Report interface) query to
query or other transactions is possible
Transaction Code: RSBBS
What is a calculated key figure??
It is
a formula consisting of basic restricted or other calculated key figure
available in the info provider stored in Meta data repository for reuse in
multiple queries
What is the purpose of Filter?
We
use filters to limit selection of data to a particular characteristic value or
set of values
What is Exceptions and Explain?
In
exception reporting we select and highlight objects that are in someway different
or critical.
Results
that fall outside a set of predetermined threshold values (exceptions) are
highlighted in color. This enables you to identify immediately any results that
deviate from expected result.
Exception
reporting allows you to determine the objects that are critical for a query,
both online and in background processing.
How do you perform currency translation in
BEx?
Currency
translation in the update provides option of translating data records from
source currency to target currency. Transaction Code: RRC1
What is the use of variable?
To
make our report dynamic we use Variable. Transaction Code: RSZV
What is Info set?
Info
sets are collections of key figures that have a logical relationship to each
other. They contain either key figures or sets. This allows you to generate
different levels of detail.
What is an Info structure?
Info
structures are the database tables in which period-based data of the operative
application is updated. They are comprised of characteristics suitable for summarization
(such as, Purchasing Organizations), and key figures (for example, Sales).
What are the types of ODS Objects and
explain?
An ODS is a subject-oriented, integrated, volatile,
current-valued data store containing only corporate detailed data.
The ODS is an intermediate storage area of operational data
for the data warehouse
Advantage of the overwrite capability of
the ODS for key f and non kf since it is not possible to overwrite the
data in infocubes.By default the ODS contains TD only.ODS is based on two
dimensional database stru the no of db table joins will be minimal &the
query will be executed more efficiently.
There
are 2 main types of ODS Objects:
1. Standard ODS Object
2. Transactional ODS Object
Data
is immediately available for reporting. The transactional ODS object is
different from the standard ODS object thru its architecture and the way in
which it contains data. It fulfills the requirement for data to be immediately
visible. This is necessary for example, SAP Strategic Enterprise Management (SEM)
or other external applications.
Use:
Transactional ODS objects allow data to be available quickly
What are the ways one can define Transfer rule? TR are
used to define how the fields of the T stru are assigned to the Infoobject of
the comm. Stru.TR a transfer stru and a comm. Stru (infosource).3 types of TR
1) field to field2)fixed value
3) Local T routine using ABAP
What is
maximum number of key figures or facts in a fact table? Ans 233(255-6-16char)
What is maximum number of characteristics in a dimension?
Ans 248
Define Remote Cube and Multi Cube and
procedure for creating them? What is the use of Remote and Multi Cube, where
exactly we will use them?........................................................
Why an Info cube contains 16 dimensions,
why not some other number?
A
dimension uses a key column in the fact table. In most databases, a table can
have a maximum of 16 key columns. Therefore, BW mandates that an Info cube can
have a maximum of 16 dimensions: three are reserved for Data Packet, Time and
Unit; the remaining 13 are left for us to use.
Which Control table is updated in the OLTP for BW delta extraction?
Table
name TMCBIW (Mention the Info structure S###).
No comments:
Post a Comment