1. ModelFlow and Pandas DataFrames
Any class can have both properties (data) and methods (functions that operate on the data of a particular instance of the class). In object-oriented programming languages like Python, classes can be built as supersets of existing classes. The ModelFlow class model inherits or encapsulates all of the features of the pandas DataFrame and extends it in many ways. Some of the methods below are standard pandas methods; others have been added by ModelFlow.
Data in a DataFrame can be modified directly with built-in pandas functionality such as .loc[] and eval(), discussed in the previous chapter, but ModelFlow extends these capabilities in important ways.
In this chapter - ModelFlow and Pandas DataFrames
This chapter explores how DataFrames are integrated into ModelFlow and how ModelFlow extends standard pandas to facilitate working with economic models.
Key points include:
Pandas and ModelFlow: DataFrames are central to organizing and manipulating model data. ModelFlow extends pandas functionality for time-series and macroeconomic analysis.
Key Features:
Column Names: Ensure consistent variable naming conventions in DataFrames.
Index and Time Dimensions: Use indexed DataFrames to handle time-based data effectively.
Leads and Lags: ModelFlow provides built-in methods to manage lead and lag relationships between variables.
Core Methods:
.upd(): Updates a DataFrame with new values or transformations for variables.
.mfcalc(): Calculates transformed variables based on the model’s equations.
1.1. Column names in ModelFlow
While pandas DataFrames are very liberal in what names can be given to columns, ModelFlow is more restrictive.
Specifically, in ModelFlow a variable name must:
start with a letter
be upper case
Note: ModelFlow variable names
ModelFlow places more restrictions on column names than pandas does. ModelFlow variable names must start with a letter and be upper case.
Thus, while all of the names below are legal column names in pandas, some are illegal in ModelFlow.
| Variable name | Legal in ModelFlow? | Reason |
|---|---|---|
| IB | Yes | Starts with a letter and is uppercase |
| ib | No | Lowercase letters are not allowed |
| 42ANSWER | No | Does not start with a letter |
| _HORSE1 | No | Does not start with a letter |
| A_VERY_LONG_NAME_THAT_IS_LEGAL_3 | Yes | Starts with a letter and is uppercase |
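The naming rules above can be checked mechanically before a DataFrame is handed to a model. The sketch below is illustrative only: `is_legal_modelflow_name` is a hypothetical helper, not part of ModelFlow, and it assumes that after the first letter only upper-case letters, digits and underscores are allowed, which is consistent with the table but not stated as a rule in this chapter.

```python
import re

# Assumption: after the leading letter, only upper-case letters, digits and
# underscores may appear. This matches the table above but is not an
# official ModelFlow rule.
_NAME_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")


def is_legal_modelflow_name(name: str) -> bool:
    """Return True if `name` passes the (assumed) ModelFlow naming rules."""
    return _NAME_PATTERN.fullmatch(name) is not None


# The five names from the table above.
names = ["IB", "ib", "42ANSWER", "_HORSE1", "A_VERY_LONG_NAME_THAT_IS_LEGAL_3"]
results = {name: is_legal_modelflow_name(name) for name in names}
```

Applied to the columns of a DataFrame, for example `[c for c in df.columns if not is_legal_modelflow_name(c)]`, the same helper would list the columns that need renaming.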
1.2. .index and time dimensions in ModelFlow
As we saw above, series have indices. DataFrames also have indices, which are the row names of the DataFrame.
In ModelFlow the index series is typically understood to represent a date.
For yearly models, a list of integers, as in the example above, works fine.
For higher-frequency models (quarterly, monthly, weekly, daily, etc.) the index can be one of several pandas date types, but users are encouraged to use pd.period_range() to create date indexes, because the ModelFlow reporting methods (tables and graphs) work well with indexes generated this way.
Warning
Not all Python date types work well with the graphics routines of ModelFlow. Users are advised to use pd.period_range() to generate date indexes for higher-frequency data (e.g. monthly or quarterly data).
For example:
import pandas as pd
# Quarterly PeriodIndex; df (defined earlier) must have one row per period
dates = pd.period_range(start='1975q1', end='2125q4', freq='Q')
df.index = dates
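As a self-contained illustration of the same idea, the sketch below builds a small quarterly DataFrame from scratch. The column name GDP, the date range and the values are made up for the example; only pd.period_range() and the standard DataFrame constructor are used.

```python
import pandas as pd

# Eight quarters, 2020Q1 through 2021Q4, as a PeriodIndex.
dates = pd.period_range(start="2020Q1", end="2021Q4", freq="Q")

# Hypothetical data: one upper-case column, one row per quarter.
df = pd.DataFrame({"GDP": [100.0 + i for i in range(len(dates))]}, index=dates)

first_period = str(df.index[0])
last_period = str(df.index[-1])
```

Because the index is a PeriodIndex, each row label stands for a whole quarter rather than a single timestamp within it.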
1.3. Leads and lags
Pandas does not support the economic idea of leads and lags per se (although the .shift() method can be used to emulate the same idea in ordered DataFrames).
ModelFlow explicitly supports the idea of leads and lags. In ModelFlow, leads and lags are indicated by following the variable name with an offset in parentheses: -1 or -2 for one- or two-period lags (the number after the minus sign is the number of time periods lagged). Positive numbers are used for leads (the + sign is optional).
When a method defined by the ModelFlow class encounters something like A(-1), it takes the value from the row above the current row, regardless of whether the index is an integer, a year, a quarter or a millisecond. The same goes for leads: A(+1) returns the value of A in the next row.
As a result, in a quarterly model, B=A(-4) would assign B the value of A from the same quarter of the previous year.
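The row-based meaning of leads and lags can be emulated in plain pandas with .shift(). The sketch below is not how ModelFlow evaluates A(-1) internally; it only reproduces the described behavior on made-up quarterly data. Note that the sign is reversed: .shift(1) corresponds to A(-1), and .shift(-1) corresponds to A(+1).

```python
import pandas as pd

# Hypothetical quarterly data for a single variable A.
dates = pd.period_range(start="2020Q1", end="2021Q4", freq="Q")
df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}, index=dates)

# Emulate A(-1): the value from the row above (one-period lag).
df["A_LAG1"] = df["A"].shift(1)

# Emulate A(+1): the value from the row below (one-period lead).
df["A_LEAD1"] = df["A"].shift(-1)

# Emulate B = A(-4): same quarter of the previous year in a quarterly index.
df["B"] = df["A"].shift(4)
```

Rows with no available lag or lead (the first rows for lags, the last row for the lead) are filled with NaN, so B is only defined from 2021Q1 onward in this example.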