The Second Stage Uncertainty Model

The second stage of uncertainty topic modeling refines the earlier analysis by narrowing the model to themes directly related to uncertainty. It introduces a new set of word priors and reduces the number of topics, so that the core aspects of uncertainty are captured with greater precision.

Model Parameters#

Priors#

The model in this stage incorporates three specific priors, each targeting different facets of uncertainty:

Prior 0: Concentrates on terms that encapsulate the emotional and psychological dimensions of uncertainty, such as “uncertain,” “risk,” “uncertainty,” “fear,” “panic,” “concern,” and “shock.”
Prior 1: Focuses on terms that signify efforts to mitigate or address uncertainty, including “improve,” “strengthen,” “ensure,” and “enhance.”
Prior 2: Targets terms that reflect economic responses to uncertainty, such as “hike” and “cut.”

Together, these priors steer the model toward three facets of uncertainty: emotional reactions, mitigation efforts, and economic responses.
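As a minimal sketch of how such seed terms can be turned into per-topic weights, the snippet below builds one weight vector per word, high for the word's target topic and low elsewhere. The helper `build_word_priors` and the mapping of prior index to topic index are assumptions made for illustration; this is not the `thematos` `WordPrior` implementation. The two weight values are taken from `min_prior_weight` and `max_prior_weight` in the configuration printed further down.

```python
# Illustrative sketch only: not the thematos WordPrior implementation.
K = 5                    # number of topics in this stage
MIN_PRIOR_WEIGHT = 0.01  # 'min_prior_weight' in the printed config
MAX_PRIOR_WEIGHT = 1.0   # 'max_prior_weight' in the printed config

# Seed terms per prior, as listed above.
# Assumption: the prior index doubles as the target topic index.
SEED_TERMS = {
    0: ["uncertain", "risk", "uncertainty", "fear", "panic", "concern", "shock"],
    1: ["improve", "strengthen", "ensure", "enhance"],
    2: ["hike", "cut"],
}


def build_word_priors(seed_terms, k, low, high):
    """Map each seed word to a length-k weight vector peaked at its topic."""
    priors = {}
    for topic_id, words in seed_terms.items():
        for word in words:
            weights = [low] * k
            weights[topic_id] = high
            priors[word.lower()] = weights
    return priors


word_priors = build_word_priors(SEED_TERMS, K, MIN_PRIOR_WEIGHT, MAX_PRIOR_WEIGHT)
print(word_priors["risk"])  # [1.0, 0.01, 0.01, 0.01, 0.01]
print(word_priors["hike"])  # [0.01, 0.01, 1.0, 0.01, 0.01]
```

A vector of this shape is what topic-modeling libraries with seeded priors typically expect, for example through a per-word prior setter. Whether `thematos` builds its priors exactly this way is not confirmed by this page.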

Number of Topics#

The model is configured with k=5, fewer topics than in the previous stage. The reduction serves the goal of homing in on the most salient aspects of uncertainty and makes the resulting topics easier to interpret.

Configuration Details#

The following command prints the composed configuration for this model. With noop=1 the command only displays the configuration and does not run the model, as the last line of the output confirms:

!nbcpu +model=nbcpu-topic_uncertainty_filtered noop=1

## Command Line Interface for HyFI ##
{'about': {'authors': 'Young Joon Lee <entelecheia@hotmail.com>',
           'description': 'Quantifying Central Bank Policy Uncertainty in a '
                          'Highly Dollarized Economy: A Topic Modeling '
                          'Approach',
           'homepage': 'https://nbcpu.entelecheia.ai',
           'license': 'MIT',
           'name': 'Measuring Central Bank Policy Uncertainty'},
 'debug_mode': False,
 'dryrun': False,
 'hydra_log_dir': '/home/yjlee/.hyfi/logs/hydra',
 'ignore_warnings': True,
 'logging_level': 'WARNING',
 'model': {'_config_group_': '/model',
           '_config_name_': 'lda',
           '_target_': 'thematos.models.lda.LdaModel',
           'autosave': True,
           'batch': {'_config_group_': '/batch',
                     '_config_name_': '__init__',
                     'batch_name': 'model',
                     'batch_num': None,
                     'batch_num_auto': False,
                     'batch_root': 'workspace/topic',
                     'config_dirname': 'configs',
                     'config_json': 'config.json',
                     'config_yaml': 'config.yaml',
                     'device': 'cpu',
                     'num_devices': 1,
                     'num_workers': 1,
                     'output_extention': None,
                     'output_suffix': None,
                     'random_seed': False,
                     'resume_latest': False,
                     'resume_run': False,
                     'seed': -1,
                     'verbose': True},
           'batch_name': 'model',
           'coherence_metric_list': ['u_mass', 'c_uci', 'c_npmi', 'c_v'],
           'corpus': {'_config_group_': '/dataset',
                      '_config_name_': 'topic_corpus',
                      '_target_': 'thematos.datasets.corpus.Corpus',
                      'batch': {'_config_group_': '/batch',
                                '_config_name_': '__init__',
                                'batch_name': 'corpus',
                                'batch_num': None,
                                'batch_num_auto': False,
                                'batch_root': 'workspace/topic',
                                'config_dirname': 'configs',
                                'config_json': 'config.json',
                                'config_yaml': 'config.yaml',
                                'device': 'cpu',
                                'num_devices': 1,
                                'num_workers': 1,
                                'output_extention': None,
                                'output_suffix': None,
                                'random_seed': False,
                                'resume_latest': False,
                                'resume_run': False,
                                'seed': -1,
                                'verbose': True},
                      'batch_name': 'corpus',
                      'data_load': {'_target_': 'hyfi.utils.datasets.load.DSLoad.load_dataframe',
                                    'columns': None,
                                    'data_dir': None,
                                    'data_file': 'datasets/processed/topic_uncertainty_filtered/train.parquet',
                                    'filetype': None,
                                    'index_col': None,
                                    'verbose': False},
                      'id_col': 'id',
                      'module': None,
                      'ngramize': True,
                      'ngrams': {'_config_group_': '/ngrams',
                                 '_config_name_': 'tp_ngrams',
                                 '_target_': 'thematos.datasets.ngrams.NgramConfig',
                                 'delimiter': '_',
                                 'max_cand': 5000,
                                 'max_len': 3,
                                 'min_cf': 20,
                                 'min_df': 10,
                                 'min_score': 0.5,
                                 'normalized': True,
                                 'workers': 0},
                      'path': {'_config_name_': '__batch__',
                               'batch_name': 'corpus',
                               'task_name': 'topic',
                               'task_root': 'workspace'},
                      'pipelines': [],
                      'stopwords': {'_config_group_': '/stopwords',
                                    '_config_name_': '__init__',
                                    '_target_': 'lexikanon.stopwords.Stopwords',
                                    'lowercase': True,
                                    'name': 'stopwords',
                                    'nltk_stopwords_lang': None,
                                    'stopwords_fn': None,
                                    'stopwords_list': None,
                                    'stopwords_path': '/home/yjlee/.hyfi/logs/hydra/hyfi/2023-08-15/2023-08-15_19-45-05/tests/assets/stopwords/nbcpu-uncertainty_filtered.txt',
                                    'verbose': True},
                      'task_name': 'topic',
                      'task_root': 'workspace',
                      'text_col': 'tokens',
                      'timestamp_col': 'time',
                      'verbose': True,
                      'version': '0.0.0'},
           'eval_coherence': True,
           'model_args': {'_config_group_': '/model/config',
                          '_config_name_': 'lda',
                          '_target_': 'thematos.models.config.LdaConfig',
                          'alpha': 0.1,
                          'eta': 0.01,
                          'k': 10,
                          'min_cf': 500,
                          'min_df': 500,
                          'rm_top': 0,
                          'tw': 1},
           'model_type': 'LDA',
           'module': None,
           'path': {'_config_name_': '__batch__',
                    'batch_name': 'model',
                    'task_name': 'topic',
                    'task_root': 'workspace'},
           'pipelines': [],
           'save_full': True,
           'set_wordprior': True,
           'task_name': 'topic',
           'task_root': 'workspace',
           'train_args': {'_config_group_': '/model/train',
                          '_config_name_': 'topic',
                          '_target_': 'thematos.models.config.TrainConfig',
                          'burn_in': 0,
                          'interval': 10,
                          'iterations': 100},
           'train_summary_args': {'_config_group_': '/model/summary',
                                  '_config_name_': 'topic_train',
                                  '_target_': 'thematos.models.config.TrainSummaryConfig',
                                  'flush': False,
                                  'initial_hp': True,
                                  'params': True,
                                  'topic_word_top_n': 10},
           'verbose': True,
           'version': '0.0.0',
           'wc_args': {'_config_group_': '/model/plot',
                       '_config_name_': 'wordcloud',
                       '_target_': 'thematos.models.config.WordcloudConfig',
                       'dpi': 300,
                       'figsize': None,
                       'fontpath': None,
                       'height_multiple': 2,
                       'make_collage': True,
                       'mask_dir': None,
                       'num_cols': 5,
                       'num_images_per_page': 20,
                       'num_rows': None,
                       'output_file_format': 'wordcloud_p{page_num:02d}.png',
                       'save': True,
                       'title_color': 'green',
                       'title_fontsize': 14,
                       'titles': None,
                       'top_n': 500,
                       'wc': {'_config_group_': '/plot',
                              '_config_name_': 'wordcloud',
                              '_target_': 'thematos.plots.wordcloud.WordCloud',
                              'background_color': 'black',
                              'collocation_threshold': 30,
                              'collocations': True,
                              'color_func': None,
                              'colormap': 'PuBu',
                              'contour_color': 'steelblue',
                              'contour_width': 0,
                              'font_path': None,
                              'font_step': 1,
                              'height': 200,
                              'include_numbers': False,
                              'mask': None,
                              'max_font_size': None,
                              'max_words': 200,
                              'min_font_size': 4,
                              'min_word_length': 0,
                              'mode': 'RGB',
                              'normalize_plurals': True,
                              'prefer_horizontal': 0.9,
                              'regexp': None,
                              'relative_scaling': 'auto',
                              'repeat': False,
                              'scale': 1,
                              'stopwords': None,
                              'width': 400},
                       'width_multiple': 4},
           'wordprior': {'_config_group_': '/words',
                         '_config_name_': 'wordprior',
                         '_target_': 'thematos.models.prior.WordPrior',
                         'data_file': '/home/yjlee/.hyfi/logs/hydra/hyfi/2023-08-15/2023-08-15_19-45-05/tests/assets/words/word_prior_uncertainty_filtered.yaml',
                         'lowercase': True,
                         'max_prior_weight': 1.0,
                         'min_prior_weight': 0.01,
                         'prior_data': None,
                         'verbose': True}},
 'noop': 1,
 'resolve': True,
 'verbose': False,
 'version': '0.15.0'}

Dryrun is enabled, not running the HyFI config

Running the Workflow#

The entire workflow can be executed using the following command:

!nbcpu +workflow=nbcpu tasks='[nbcpu-topic_uncertainty_filtered]' mode=__info__

[2023-08-15 19:45:59,528][hyfi.joblib.joblib][INFO] - initialized batcher with <hyfi.joblib.batch.batcher.Batcher object at 0x7f010418ebe0>
[2023-08-15 19:45:59,529][hyfi.main.config][INFO] - HyFi project [nbcpu] initialized
[2023-08-15 19:45:59,718][hyfi.main.main][INFO] - The HyFI config is not instantiatable, running HyFI task with the config
[2023-08-15 19:46:00,546][hyfi.joblib.joblib][INFO] - initialized batcher with <hyfi.joblib.batch.batcher.Batcher object at 0x7f00e43217c0>
[2023-08-15 19:46:01,674][hyfi.task.batch][INFO] - Initalized batch: corpus(1) in /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus
[2023-08-15 19:46:03,099][hyfi.task.batch][INFO] - Initalized batch: corpus(1) in /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus
[2023-08-15 19:46:03,100][hyfi.task.batch][INFO] - Initalized batch: model(0) in /home/yjlee/workspace/projects/nbcpu/workspace/topic/model
[2023-08-15 19:46:03,805][hyfi.task.batch][INFO] - Initalized batch: corpus(1) in /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus
[2023-08-15 19:46:05,521][hyfi.batch.batch][INFO] - Setting seed to 3184339343
[2023-08-15 19:46:05,521][hyfi.batch.batch][INFO] - Init batch - Batch name: model, Batch num: 0
[2023-08-15 19:46:06,125][hyfi.batch.batch][INFO] - Setting seed to 3213887147
[2023-08-15 19:46:06,125][hyfi.batch.batch][INFO] - Init batch - Batch name: corpus, Batch num: 0
[2023-08-15 19:46:06,292][hyfi.batch.batch][INFO] - Init batch - Batch name: corpus, Batch num: 1
[2023-08-15 19:46:06,292][hyfi.task.batch][INFO] - Initalized batch: corpus(1) in /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/corpus
[2023-08-15 19:46:06,293][hyfi.batch.batch][INFO] - Init batch - Batch name: model, Batch num: 4
[2023-08-15 19:46:06,293][hyfi.task.batch][INFO] - Initalized batch: model(4) in /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model
[2023-08-15 19:46:07,065][hyfi.task.batch][INFO] - Initalized batch: corpus(1) in /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus
[2023-08-15 19:46:07,065][hyfi.task.batch][INFO] - Initalized batch: runner(1) in /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/runner
[2023-08-15 19:46:07,065][hyfi.workflow.workflow][INFO] - Running task [nbcpu-topic_uncertainty_filtered] with [run={} verbose=False uses='nbcpu-topic_uncertainty_filtered']
[2023-08-15 19:46:07,066][hyfi.task.batch][INFO] - > Loading config for batch_name: model batch_num: -1
[2023-08-15 19:46:07,066][hyfi.task.batch][INFO] - Loading config from /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/configs/model(2)_config.yaml
[2023-08-15 19:46:07,096][hyfi.task.batch][INFO] - Merging config with the loaded config
[2023-08-15 19:46:07,125][hyfi.task.batch][INFO] - Updating config with config_kwargs: {}
[2023-08-15 19:46:08,623][hyfi.task.batch][INFO] - Initalized batch: corpus(0) in /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/corpus
[2023-08-15 19:46:08,624][hyfi.task.batch][INFO] - Initalized batch: model(2) in /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model
[2023-08-15 19:46:08,637][thematos.models.lda][INFO] - Model loaded from /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/models/LDA_model(2)_k(5).mdl
[2023-08-15 19:46:08,638][hyfi.utils.iolibs][INFO] - Processing [1] files from ['/home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/LDA_model(2)_k(5)-ll_per_word.csv']
[2023-08-15 19:46:08,638][hyfi.utils.datasets.load][INFO] - Loading data from /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/LDA_model(2)_k(5)-ll_per_word.csv
[2023-08-15 19:46:08,640][hyfi.utils.datasets.load][INFO] -  >> elapsed time to load data: 0:00:00.002268
[2023-08-15 19:46:08,641][hyfi.utils.iolibs][INFO] - Processing [1] files from ['/home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/LDA_model(2)_k(5)-doc_topic_dists.parquet']
[2023-08-15 19:46:08,641][hyfi.utils.datasets.load][INFO] - Loading data from /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/LDA_model(2)_k(5)-doc_topic_dists.parquet
[2023-08-15 19:46:08,739][hyfi.utils.datasets.load][INFO] -  >> elapsed time to load data: 0:00:00.097628
[2023-08-15 19:46:08,739][hyfi.utils.iolibs][INFO] - Processing [1] files from ['/home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/LDA_model(2)_k(5)-topic_term_dists.parquet']
[2023-08-15 19:46:08,740][hyfi.utils.datasets.load][INFO] - Loading data from /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/LDA_model(2)_k(5)-topic_term_dists.parquet
[2023-08-15 19:46:08,761][hyfi.utils.datasets.load][INFO] -  >> elapsed time to load data: 0:00:00.021842
[2023-08-15 19:46:08,762][thematos.datasets.corpus][INFO] - Loading corpus...
[2023-08-15 19:46:08,762][thematos.datasets.corpus][INFO] - Processing documents in the column 'tokens'...
[2023-08-15 19:46:43,286][thematos.datasets.corpus][INFO] - Total 27594 documents are loaded.
[2023-08-15 19:47:02,935][hyfi.utils.datasets.save][INFO] - Saving dataframe to /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus/corpus_doc_ids.parquet
[2023-08-15 19:47:02,958][hyfi.composer.config][INFO] - Saving config to /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus/configs/corpus(1)_config.json
[2023-08-15 19:47:02,959][hyfi.composer.config][INFO] - Saving config to /home/yjlee/workspace/projects/nbcpu/workspace/topic/corpus/configs/corpus(1)_config.yaml
[2023-08-15 19:47:04,126][thematos.models.base][INFO] - Number of documents inferred: 27594
[2023-08-15 19:47:04,176][thematos.models.base][INFO] - Inferred topics:
          id    topic0    topic1    topic2    topic3    topic4  log_likelihood
0  501330943  0.000177  0.000333  0.000149  0.999217  0.000125     -451.052487
1  501326169  0.001222  0.995933  0.001027  0.000954  0.000864      -63.390180
2  501325554  0.067655  0.932007  0.000122  0.000113  0.000103     -557.345116
3  501322478  0.000243  0.999190  0.000205  0.000190  0.000172     -315.419008
4  501321914  0.999414  0.000262  0.000117  0.000109  0.000098     -569.363809
[2023-08-15 19:47:04,188][hyfi.utils.datasets.save][INFO] - Saving dataframe to /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/inferred_topics/LDA_model(2)_k(5)-inferred_doc_topic_dists.parquet
[2023-08-15 19:47:04,276][hyfi.utils.datasets.save][INFO] -  >> elapsed time to save data: 0:00:00.087966
[2023-08-15 19:47:04,277][thematos.models.base][INFO] - Inferred topics saved to /home/yjlee/workspace/projects/nbcpu/workspace/nbcpu-topic_uncertainty_filtered/model/outputs/inferred_topics/LDA_model(2)_k(5)-inferred_doc_topic_dists.parquet
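
The inferred topics table in the log above gives one row of topic shares per document. The short sketch below uses the five rows shown in that log to read off each document's dominant topic. The helper `dominant_topic` is a hypothetical name introduced here, not part of `thematos`.

```python
# First five rows of the inferred doc-topic table printed in the log above:
# (document id, [topic0, topic1, topic2, topic3, topic4])
rows = [
    (501330943, [0.000177, 0.000333, 0.000149, 0.999217, 0.000125]),
    (501326169, [0.001222, 0.995933, 0.001027, 0.000954, 0.000864]),
    (501325554, [0.067655, 0.932007, 0.000122, 0.000113, 0.000103]),
    (501322478, [0.000243, 0.999190, 0.000205, 0.000190, 0.000172]),
    (501321914, [0.999414, 0.000262, 0.000117, 0.000109, 0.000098]),
]


def dominant_topic(weights):
    """Return (topic index, share) for the largest topic share."""
    idx = max(range(len(weights)), key=weights.__getitem__)
    return idx, weights[idx]


for doc_id, weights in rows:
    topic, share = dominant_topic(weights)
    print(doc_id, f"topic{topic}", round(share, 3))
# 501330943 topic3 0.999
# 501326169 topic1 0.996
# 501325554 topic1 0.932
# 501322478 topic1 0.999
# 501321914 topic0 0.999
```

In these five rows nearly all of the probability mass falls on a single topic per document.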

Model Results of the Second Stage Uncertainty Model#

The second stage uncertainty model was trained on a refined corpus of 6,917 documents and 259,912 words, using 233 of the 46,850 terms in the vocabulary. Its configuration reflects the specific focus on uncertainty, with 5 topics and tailored hyperparameters.
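
The small share of the vocabulary that the model uses is consistent with frequency-based filtering. The sketch below shows that kind of filter on a toy corpus, keeping a word only if both its collection frequency and its document frequency reach a threshold. The function `filter_vocab` and the toy documents are hypothetical. The printed configuration lists `min_cf` and `min_df` of 500; whether the trained model used those exact thresholds is an assumption.

```python
from collections import Counter


def filter_vocab(docs, min_cf, min_df):
    """Keep words whose collection and document frequencies meet both thresholds."""
    cf = Counter()  # total occurrences across the corpus
    df = Counter()  # number of documents containing the word
    for tokens in docs:
        cf.update(tokens)
        df.update(set(tokens))
    return {w for w in cf if cf[w] >= min_cf and df[w] >= min_df}


# Toy corpus (hypothetical), with small thresholds so the effect is visible.
docs = [
    ["risk", "uncertainty", "risk"],
    ["risk", "hike"],
    ["cut", "risk", "uncertainty"],
]
kept = filter_vocab(docs, min_cf=2, min_df=2)
print(sorted(kept))  # ['risk', 'uncertainty']

# Share of the vocabulary used by the model, from the figures reported above.
print(f"{233 / 46850:.2%}")  # 0.50%
```

Using the reported figures, about half a percent of the vocabulary remains in the model.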

Topics and Interpretation#

The topics generated by the model align with the priors and provide insights into different facets of uncertainty:

Topic #0: This topic captures the dynamics of risk and uncertainty in economic recovery. Terms like “risk,” “uncertainty,” “slowdown,” and “concern” reflect the challenges and uncertainties in economic growth and asset management.
Topic #1: Focused on strategies to mitigate uncertainty, this topic includes terms such as “improve,” “ensure,” “strengthen,” and “collaboration.” It represents efforts to enhance technical capabilities, establish strategic collaborations, and achieve relevant goals.
Topic #2: Reflecting economic responses to uncertainty, this topic includes terms like “cut,” “hike,” “fell,” and “fear.” It captures the volatility and concerns related to financial markets and economic indicators.
Topic #3: This topic emphasizes reform and goal-oriented strategies to address uncertainty. Terms like “reform,” “reduce,” “target,” and “improve” signify a focus on long-term planning and recovery.
Topic #4: Concentrating on socio-economic aspects, this topic includes terms like “poor,” “income,” “vulnerable,” and “efficiency.” It reflects the challenges and strategies related to poverty reduction, income enhancement, and resilience building.

Fig. 7 shows the wordcloud of the top 500 words in each topic from the LDA model with 5 topics and uncertainty prior.

Fig. 7 Wordcloud of the top 500 words in each topic from the LDA model with 5 topics and uncertainty prior (image: LDA_model(2)_k(5)_wordcloud_00.png).

Quantifying Central Bank Policy Uncertainty#

For the purpose of quantifying central bank policy uncertainty, the most relevant output of the second stage model is topics 0, 1, and 2. Each captures a different aspect or stage of uncertainty, and together they cover the main components of central bank policy uncertainty.

Topic 0: Early Stage of Uncertainty#

Interpretation: This topic captures the early indicators and dynamics of risk and uncertainty, reflecting terms like “risk,” “recovery,” “slow,” and “uncertainty.”
Relevance: It represents the early stage of uncertainty where economic challenges and asset management concerns begin to emerge, setting the stage for potential central bank interventions.

Topic 1: Reactive Measures#

Interpretation: Focused on mitigation strategies, this topic includes terms such as “improve,” “ensure,” “strengthen,” and “framework.”
Relevance: It signifies the reactive measures that might be employed to enhance technical capabilities, establish strategic collaborations, and achieve relevant goals in response to unfolding uncertainties.

Topic 2: Central Bank Actions#

Interpretation: Reflecting direct economic responses, this topic includes terms like “cut,” “hike,” “fell,” and “fear.”
Relevance: It captures the actions of the central bank, such as interest rate cuts or hikes, and the associated market reactions and fears. This topic is directly related to the central bank’s policy decisions in the face of economic uncertainty.

Together, these three topics give a structured view of central bank policy uncertainty: the early stage of uncertainty (Topic 0), reactive measures (Topic 1), and the central bank’s own actions (Topic 2). They may correspond to different stages of uncertainty, each tied to a different facet of central bank policy. Separating these stages makes it easier to quantify and analyze policy uncertainty, which is useful for policymakers, economists, and financial analysts. The close match between the estimated topics and the priors suggests that the model is well suited to this purpose.
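
One possible way to turn these topics into a time series is to add up each document's shares for topics 0, 1, and 2 and average the result per period. The sketch below illustrates that idea only. The aggregation rule, the function `uncertainty_index`, and the toy records are all assumptions made for illustration, not the method used in this project.

```python
from collections import defaultdict

UNCERTAINTY_TOPICS = (0, 1, 2)  # topics singled out in this section


def uncertainty_index(records, topics=UNCERTAINTY_TOPICS):
    """Average, per period, the combined share of the selected topics."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for period, weights in records:
        totals[period] += sum(weights[t] for t in topics)
        counts[period] += 1
    return {p: totals[p] / counts[p] for p in sorted(totals)}


# Hypothetical toy records: (period, doc-topic shares for 5 topics).
records = [
    ("2020-01", [0.70, 0.10, 0.10, 0.05, 0.05]),
    ("2020-01", [0.10, 0.10, 0.10, 0.60, 0.10]),
    ("2020-02", [0.20, 0.30, 0.40, 0.05, 0.05]),
]
index = uncertainty_index(records)
print({p: round(v, 2) for p, v in index.items()})
# {'2020-01': 0.6, '2020-02': 0.9}
```

Keeping the topic selection as a parameter makes it easy to compute a separate series for each stage, for example `uncertainty_index(records, topics=(2,))` for central bank actions alone.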