Description
By comparing the categories of Chinese and US articles in different periods, we aim to identify which aspects of the nuclear reactor issue receive more attention in China and in the US. Articles were classified into five categories: politics, economy, technology, environment, and health.
The categories of nuclear reactor coverage differ between the two countries. In April 2016, influenced by the 30th anniversary of the Chernobyl disaster, articles in both China and the US fell mainly into the environment category; the second most common category was economy in US news and technology in Chinese news. In February 2017, following the "New fuel leaks were discovered at Fukushima" story, Chinese news media paid particular attention to the event because of China's geographic proximity to Japan, and most Chinese articles fell into the environment category. In contrast, most articles published by US media fell into the economy category, because of the bankruptcy declared by Westinghouse and Toshiba during this period. US news media also published many political articles on the nuclear reactor issue, focusing especially on the Iran and North Korea nuclear issues.
Protocol
Based on the news from the top 10 news providers selected for each country, we used Aylien's Semantic Labeling Tool to classify the articles by category. For Chinese news, we first translated the text into English with Google Translate and then classified the English translation with the Aylien tool. For American news, we entered each article's URL into the Semantic Labeling Tool. The results were collected in an Excel spreadsheet and visualized with rawgraphs.io.
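The counting step between the spreadsheet and the chart can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the row keys "dataset" and "category" and the function names are assumptions, and the sketch presumes the Aylien labels have already been mapped to the five categories of this study.

```python
from collections import Counter

# The five categories used in this study.
CATEGORIES = ("politics", "economy", "technology", "environment", "health")


def tally_categories(rows):
    """Count articles per (dataset, category).

    `rows` is an iterable of dicts with the hypothetical keys
    "dataset" (for example "1_USA 2016-April") and "category".
    """
    counts = Counter()
    for row in rows:
        # Normalize case and whitespace before comparing labels.
        category = row["category"].strip().lower()
        if category not in CATEGORIES:
            # Skip labels outside the five study categories.
            continue
        counts[(row["dataset"], category)] += 1
    return counts


def top_category(counts, dataset):
    """Return the most frequent category for one dataset, or None."""
    per_dataset = {cat: n for (ds, cat), n in counts.items() if ds == dataset}
    if not per_dataset:
        return None
    return max(per_dataset, key=per_dataset.get)
```

Calling tally_categories on the rows of one spreadsheet and then top_category for each of the four datasets would give the dominant category per country and period, which is the comparison made in the Description.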
Data
Data source: Google News, Baidu News
Aylien
Download data, Download data (Category)
The collected data was organized into one dataset:
1_USA 2016-April - 2_USA 2017-February - 3_CHINA 2016-April - 4_CHINA 2017-February
For each of the four country and period combinations of Chinese and American news, there is a dataset containing all the articles downloaded with Web Scraper (link, title, text, time, provider, category, comment, image).
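As a rough sketch of how one of these datasets could be read for further analysis, the Python below maps each row to a record with the eight fields listed above. It assumes a CSV export whose header row uses exactly those field names and stores every value as text; the Article class and the read_articles function are hypothetical names introduced here for illustration.

```python
import csv
import io
from dataclasses import dataclass, fields


@dataclass
class Article:
    # Field names follow the list in the text; keeping every value
    # as a string is an assumption about the export format.
    link: str
    title: str
    text: str
    time: str
    provider: str
    category: str
    comment: str
    image: str


def read_articles(csv_text):
    """Parse CSV text with a header row into a list of Article records."""
    names = [f.name for f in fields(Article)]
    reader = csv.DictReader(io.StringIO(csv_text))
    # Missing or empty cells become empty strings.
    return [
        Article(**{name: (row.get(name) or "") for name in names})
        for row in reader
    ]
```

Reading the file contents into a string first keeps the function independent of where the export is stored.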