The Employment of Text Analytics in Pacific Islands Countries
November 1st, 2023
Economic Policy Uncertainty
- Baker, Bloom, and Davis (2016) published "Measuring Economic Policy Uncertainty" in the Quarterly Journal of Economics (QJE); the article has been cited more than 9,800 times on Google Scholar.
- They found that "innovations in policy uncertainty foreshadow declines in investment, output, and employment in the United States and, in a panel vector autoregressive setting, for 12 major economies."
- The following newspapers are used in creating the EPU Index:
- Solomon Islands: Solomon Star, Solomon Times, The Island Sun, Solomon Islands Broadcasting Corporation, ABC AU, RNZ.
- Papua New Guinea: Post Courier, ABC AU, RNZ.
display(Image(data=sib_epu, width=800))
display(Image(png_epu, width=800))
How to calculate the EPU index
- Define three buckets of terms:
- Economic: economy/economic/economics/business/commerce/finance/financial/industry
- Policy: government/governmental/authorities/minister/ministry/parliament/parliamentary/tax/regulation/legislation/central bank/cbsi/imf/world bank/international monetary fund/debt
- Uncertainty: uncertain/uncertainty/uncertainties/unknown/unstable/unsure/undetermined/risk/risky/not certain/non-reliable
- Compute the scaled count of EPU news, $X_{it} = \frac{\text{EPU news}}{\text{All news}}$, for newspaper $i$ in month $t$, where an EPU article is one containing at least one term from each of the three buckets, following Baker, Bloom, and Davis (2016);
- Compute the standard deviation $\sigma_i$ of $X_{it}$ for each newspaper $i$ over the reference period $T_1$;
- Standardize $X_{it}$ by dividing by $\sigma_i$ for all $t$, giving $Y_{it}$;
- Compute the mean of $Y_{it}$ over newspapers in each month to obtain the series $Z_t$;
- Compute $M$, the mean value of $Z_t$ over $T_1$;
- Multiply $Z_t$ by $100/M$ for all $t$ to obtain the normalized EPU time-series index.
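The steps above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the code behind the charts: the function names (`is_epu_article`, `epu_index`) and the input layout are hypothetical, the term lists are abbreviated, the "at least one term from each bucket" rule follows Baker, Bloom, and Davis (2016), and the population standard deviation is an arbitrary choice.

```python
import re
from statistics import mean, pstdev

# Abbreviated versions of the three buckets (the full lists are given above).
ECONOMIC = ["economy", "economic", "business", "finance", "financial", "industry"]
POLICY = ["government", "minister", "ministry", "parliament", "tax",
          "regulation", "central bank", "debt"]
UNCERTAINTY = ["uncertain", "uncertainty", "unknown", "unstable", "risk", "risky"]


def mentions_any(text, terms):
    """True if the text contains at least one term as a whole word or phrase."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms)


def is_epu_article(text):
    """Assumed rule: an article is EPU news if it hits all three buckets."""
    return all(mentions_any(text, bucket) for bucket in (ECONOMIC, POLICY, UNCERTAINTY))


def epu_index(scaled_counts, reference_months):
    """Normalize scaled counts into an index that averages 100 over T_1.

    scaled_counts: {newspaper: {month: X_it}}, every newspaper covering the
    same months (an assumption of this sketch).
    reference_months: the months that make up the reference period T_1.
    """
    months = sorted(next(iter(scaled_counts.values())))
    standardized = {}
    for paper, series in scaled_counts.items():
        # sigma_i over T_1; must be non-zero for the division below.
        sigma = pstdev([series[m] for m in reference_months])
        standardized[paper] = {m: series[m] / sigma for m in months}  # Y_it
    z = {m: mean(standardized[p][m] for p in standardized) for m in months}  # Z_t
    m_ref = mean(z[m] for m in reference_months)  # M
    return {m: z[m] * 100 / m_ref for m in months}


# Made-up shares of EPU articles for two hypothetical newspapers.
counts = {
    "paper_a": {"2023-01": 0.02, "2023-02": 0.04, "2023-03": 0.06},
    "paper_b": {"2023-01": 0.10, "2023-02": 0.05, "2023-03": 0.15},
}
index = epu_index(counts, reference_months=["2023-01", "2023-02"])
```

By construction, the index averages 100 over the reference months, so values above 100 indicate more policy-uncertainty coverage than in the reference period.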
Topic Modeling
Latent Dirichlet allocation (LDA) is a frequently used method for topic modeling.
- We trained an LDA model on the above-mentioned local newspapers in Solomon Islands and Papua New Guinea.
- The right panel displays the 16 identified topics in Papua New Guinea. For example:
- Topic 10 is about business, specifically mentioning coffee, cocoa, and tourism.
- Topic 5 mentions assistance, training, and knowledge.
display(HTML(png_topic))
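To give a sense of what fitting an LDA model involves, here is a minimal collapsed Gibbs sampler in standard-library Python. It is a teaching sketch only: the slides do not say which implementation or settings were used, a library such as gensim or scikit-learn would be the usual choice in practice, and the function names (`lda_gibbs`, `top_words`), hyperparameters, and toy documents are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs: list of token lists. Returns (vocab, phi), where phi[k][w] is the
    estimated probability of word w under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + n_words * beta)
         for w in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, phi


def top_words(vocab, phi, n=3):
    """The n most probable words of each topic."""
    return [
        [vocab[w] for w in sorted(range(len(vocab)), key=lambda w: -row[w])[:n]]
        for row in phi
    ]


# Toy corpus, invented for illustration.
docs = [
    ["coffee", "cocoa", "export", "business"],
    ["tourism", "business", "coffee"],
    ["training", "assistance", "knowledge"],
    ["assistance", "training", "school"],
]
vocab, phi = lda_gibbs(docs, n_topics=2)
print(top_words(vocab, phi))
```

The top words per topic are what get labeled by hand, as with "business" for Topic 10 above; the sampler itself only produces word distributions.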
Sentiment Analysis
- Sentiment analysis is frequently employed in natural language processing.
- In essence, it calculates a score from -1 to +1 that reflects both the direction and the intensity of the emotions expressed in a corpus.
- The steps below use the four above-mentioned newspapers:
- Calculate the sentiment compound score for each news article and take a monthly average, $X_{it}$, for each newspaper $i$ in month $t$;
- Average over the four newspapers, yielding a time-series sentiment score $S_t$.
sib_sentiment
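The two averaging steps can be sketched as follows. The per-article compound scores are assumed to be precomputed by whichever sentiment tool is used (VADER, for instance, reports a compound score in this range); the function name `monthly_sentiment`, the record layout, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_sentiment(records):
    """Aggregate article-level scores into a monthly series S_t.

    records: iterable of (newspaper, "YYYY-MM", compound_score) tuples,
    one per article.
    """
    # Step 1: X_it, the monthly average score for each newspaper.
    by_paper_month = defaultdict(list)
    for paper, month, score in records:
        by_paper_month[(paper, month)].append(score)
    x = {key: mean(scores) for key, scores in by_paper_month.items()}

    # Step 2: S_t, the average of X_it over the newspapers present in month t.
    by_month = defaultdict(list)
    for (paper, month), value in x.items():
        by_month[month].append(value)
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Made-up scores for two hypothetical newspapers.
records = [
    ("paper_a", "2023-01", 0.5),
    ("paper_a", "2023-01", -0.1),
    ("paper_b", "2023-01", 0.4),
    ("paper_b", "2023-02", -0.2),
]
s_t = monthly_sentiment(records)
```

Averaging within each newspaper first gives every newspaper equal weight in $S_t$, regardless of how many articles it publishes in a month.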
Workflow
Economic Related Sentiment in Papua New Guinea
display(Image(png_econ_seniment, width=800))
Economic Related Sentiment in Solomon Islands
display(Image(sib_econ_sentiment, width=800))
Google Trends
Google Trends lets us retrieve the top topics that were searched together with a given term in a specified geographical location.
For example, if one searches for "jobs in lae" in Papua New Guinea, the related search terms include lae biscuit co. Ltd, mining, nestle, kina bank limited, Coca-Cola, and brewery.
The term "job in port moresby" yields related terms such as hospitality, Waigani, officer, hilton, and Stanley hotel & suites.
Related Search Graph for Jobs
display(HTML(png_job_network))