# A Paper A Day

I was inspired by the story “This scientist read a paper every day for 899 days. Here’s what she learned”, so I decided to try it myself: dedicating half an hour every day to reading papers. Let’s see how long I can keep doing it.

I thought about giving up APAD in September and October of 2022, so I did not read papers regularly from 09/13 to 10/21. I also did not read papers from 05/31 to 06/06 or on 2022-06-08. I will slowly make them up.

2020: 2020-12, 2020-11, 2020-10, 2020-09

## 2022-11-09 #

1. De Vocht, L., Softic, S., Dimou, A., Verborgh, R., Mannens, E., Ebner, M., & Van de Walle, R. (2015, May). Visualizing collaborations and online social interactions at scientific conferences for scholarly networking. In Proceedings of the 24th International Conference on World Wide Web (pp. 1053-1054).

The authors designed an interactive visualization that displays collaboration and online interactions among scholars.

1. Mashhadi, A., Zolyomi, A., & Quedado, J. (2022, April). A Case Study of Integrating Fairness Visualization Tools in Machine Learning Education. In CHI Conference on Human Factors in Computing Systems Extended Abstracts (pp. 1-7).

In this paper, the authors used six open-source fairness visualization tools to study how these tools help students understand algorithmic bias.

1. Moere, A. V., & Purchase, H. (2011). On the role of design in information visualization. Information Visualization, 10(4), 356-371.

This is a very interesting article. The authors argue that information visualizations have “utility”, “soundness”, and “attractiveness”. Academics tend to focus on the first two and overlook “attractiveness”. The paper argues that “attractiveness” should be considered a viable dimension in teaching, evaluating, and conducting visualization research.

This paper made me wonder: What makes a scientific paper look good?

## 2022-11-08 #

1. Chakrabarti, A., Ahmad, F., & Quix, C. (2021). Towards a Rule-based Visualization Recommendation System. In KDIR (pp. 57-68).

I like this paper. The authors proposed a visualization recommendation system that is based on rules. Specifically, these rules are based on data abstraction and task abstraction. After identifying the data that users have and the tasks users want to perform, the system will recommend appropriate visualizations.

• Distribution: extrema, mean/median/mode, range, characterize distribution
• Part-to-Whole: categorical filter, categorical analysis
• Change over time: trend, sort
• Comparison: sort, filter, trend
• Relationship: retrieve value, cluster, correlation, anomalies
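To make the rule-based idea above concrete for myself, here is a minimal sketch of a lookup from a (task, data type) pair to a chart type. The rule table, the data type names, the fallback, and the `recommend` function are all my own assumptions for illustration; they are common charting conventions, not the rules from the paper.

```python
# Hypothetical rule table: (task, data type) -> suggested chart.
# These pairings are illustrative conventions, NOT the paper's actual rules.
RULES = {
    ("distribution", "quantitative"): "histogram",
    ("part-to-whole", "categorical"): "stacked bar chart",
    ("change over time", "temporal"): "line chart",
    ("comparison", "categorical"): "bar chart",
    ("relationship", "quantitative"): "scatter plot",
}


def recommend(task: str, data_type: str) -> str:
    """Return the chart suggested by the first matching rule.

    Falls back to a plain table when no rule matches (an assumed default).
    """
    key = (task.strip().lower(), data_type.strip().lower())
    return RULES.get(key, "table")


print(recommend("Distribution", "quantitative"))
print(recommend("Change over time", "temporal"))
```

A dictionary keeps each rule readable and easy to extend, which seems to be the main appeal of a rule-based recommender compared with a learned model.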
1. Valiati, E. R., Pimenta, M. S., & Freitas, C. M. (2006, May). A taxonomy of tasks for guiding the evaluation of multidimensional visualizations. In Proceedings of the 2006 AVI workshop on Beyond time and errors: novel evaluation methods for information visualization (pp. 1-6).

The authors came up with a list of tasks related to multidimensional data. This list is based on previous task taxonomies and on the tasks users performed while exploring a multidimensional data set.

## 2022-11-07 #

1. Chen, X., Lo, L. Y. H., & Qu, H. (2020). SirenLess: reveal the intention behind news. arXiv preprint arXiv:2001.02731.

In this paper, the authors designed a dashboard to visualize the linguistic features of news articles. The aim is to identify misleading news pieces.

1. Voigt, H., Alaçam, Ö., Meuschke, M., Lawonn, K., & Zarrieß, S. (2022, July). The Why and The How: A Survey on Natural Language Interaction in Visualization. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 348-374).

This paper reviews research that uses NLP in visualization. Specifically, the authors grouped these papers based on the tasks involved. These tasks are:

• Presentation: visual storytelling, explanation generation
• Discover: keyword search, querying, browsing
• Enjoy: augmentation, vis description generation
• Produce: annotation, documentation, visualization creation

These tasks are about why users interact with visualizations using natural language. Then the authors explored how natural language is used.

1. Heer, J., Bostock, M., & Ogievetsky, V. (2010). A tour through the visualization zoo. Communications of the ACM, 53(6), 59-67.

This is a must-read in visualization. The authors introduced some of the cool new visualization techniques:

• stacked graph
• small multiples
• parallel coordinates
• indented tree layout
• tree layout
• sunburst
• network
• nested circles
• arc diagram

## 2022-11-06 (Completed on 2022-11-07) #

1. Sacha, D., Sedlmair, M., Zhang, L., Lee, J. A., Peltonen, J., Weiskopf, D., … & Keim, D. A. (2017). What you see is what you can change: Human-centered machine learning by interactive visualization. Neurocomputing, 268, 164-175.

This paper proposes a conceptual framework of how visualization can be integrated into machine learning. This conceptual pipeline consists of five steps:

• Data
• Preprocessing, e.g., transformation, weights
• Machine learning model selection
• Visualization
• Analysis (Execution and Evaluation)

1. Hu, K., Bakker, M. A., Li, S., Kraska, T., & Hidalgo, C. (2019, May). Vizml: A machine learning approach to visualization recommendation. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

This is a really cool paper. The authors analyzed one million visualizations. They trained a deep neural network on dataset-visualization pairs. The model achieved high accuracy in terms of predicting the appropriate visualization. This VizML approach can be used as a visualization recommendation tool.

## 2022-11-05 (Completed on 2022-11-06) #

1. Bui, A. A., Aberle, D. R., & Kangarloo, H. (2007). TimeLine: visualizing integrated patient records. IEEE Transactions on Information Technology in Biomedicine, 11(4), 462-473.

I like this study. The authors designed a visualization system to present patients' medical records. I believe it can be widely used in healthcare and can improve people’s lives.

1. Text Visualization Techniques: Taxonomy, Visual Survey, and Community Insights

I like this paper so much. The authors reviewed many text visualization papers and analyzed the techniques used and tasks involved. The authors even analyzed the co-authorship, which is really cool! Most importantly, they designed a really cool interactive system: https://textvis.lnu.se/

## 2022-11-04 (Completed on 2022-11-05) #

1. Subramanian, S. S., Pushparaj, P., Liu, Z., & Lu, A. (2019, October). Explainable Visualization of Collaborative Vandal Behaviors in Wikipedia. In 2019 IEEE Symposium on Visualization for Cyber Security (VizSec) (pp. 1-5). IEEE.

1. Narechania, A., Karduni, A., Wesslen, R., & Wall, E. (2021). VITALITY: Promoting Serendipitous Discovery of Academic Literature with Transformers & Visual Analytics . IEEE Transactions on Visualization and Computer Graphics, 28(1), 486-496.

This is a cool project. The authors created a visualization system that helps scholars find related papers. One drawback I found in this system is that it is too complicated.

## 2022-11-03 (Completed on 2022-11-04) #

1. Chinchilla-Rodríguez, Z., Vargas-Quesada, B., Hassan-Montero, Y., González-Molina, A., & Moya-Anegón, F. (2010). New approach to the visualization of international scientific collaboration. Information visualization, 9(4), 277-287.

The authors proposed a new method to visualize international academic collaboration.

1. Li, J., Chen, X., Hovy, E., & Jurafsky, D. (2015). Visualizing and understanding neural models in NLP. arXiv preprint arXiv:1506.01066.

I couldn’t fully understand the paper but I know it is using visualization to understand neural language models.

1. Sievert, C., & Shirley, K. (2014, June). LDAvis: A method for visualizing and interpreting topics. In Proceedings of the workshop on interactive language learning, visualization, and interfaces (pp. 63-70).

This is another visualization tool that amplifies research. It helps people do topic modeling.

## 2022-11-02 (Completed on 2022-11-04) #

1. Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260.

This paper is a brief introduction to the developments in machine learning.

1. Vig, J. (2019). A multiscale visualization of attention in the transformer model . arXiv preprint arXiv:1906.05714.

This is really cool work. The author created a visualization tool (to be used with Jupyter Notebook) for language models.

## 2022-11-01 (Completed on 2022-11-03) #

1. Ralph, P. (2018). Toward methodological guidelines for process theories and taxonomies in software engineering. IEEE Transactions on Software Engineering, 45(7), 712-735.

To be honest, I was not able to fully comprehend this paper. It is too abstract for me. I know it is talking about theories for software engineering.

# 2022-10 #

## 2022-10-31 (Completed on 2022-11-02) #

I read news on science.org. I read about the space travel that scientists are pushing NASA to work on, and also the ancient Maya stargazers.

## 2022-10-30 (Completed on 2022-10-31) #

1. Lerman, K., Yu, Y., Morstatter, F., & Pujara, J. (2022). Gendered citation patterns among the scientific elite. Proceedings of the National Academy of Sciences, 119(40), e2206070119.

I really like this paper. The authors analyzed gender disparities among NAS members. They find significant gender differences in the citation networks of these members. The differences are so strong that a member’s gender can be predicted from the citation network. There are no significant structural differences in the citation network based on the prestige of the institutions where members work.

## 2022-10-29 #

Gleicher, M., Albers, D., Walker, R., Jusufi, I., Hansen, C. D., & Roberts, J. C. (2011). Visual comparison for information visualization. Information Visualization, 10(4), 289-309.

This paper explores methods for visual comparisons: juxtaposition, superposition, and explicit representation.

## 2022-10-28 #

1. Chen, X., Li, Z., Setlur, S., & Xu, W. (2022). Exploring racial and gender disparities in voice biometrics. Scientific Reports, 12(1), 1-12.

This paper talks about neural network enabled voice biometrics. It shows that there are racial and gender differences in accuracy.

1. Wapman, K. H., Zhang, S., Clauset, A., & Larremore, D. B. (2022). Quantifying hierarchy and dynamics in US faculty hiring and retention. Nature, 610(7930), 120-127.

This is a cool project. The authors analyzed US faculty hiring. The study finds that a few US universities produced a large share of faculty members. It also finds that self-hiring is more prevalent than previously thought.

## 2022-10-27 #

1. Wang, Y., Peng, T. Q., Lu, H., Wang, H., Xie, X., Qu, H., & Wu, Y. (2021). Seek for success: a visualization approach for understanding the dynamics of academic careers. IEEE Transactions on Visualization and Computer Graphics, 28(1), 475-485.

This is a cool paper. The authors designed a visualization system to show scholars' academic careers.

1. Zhang, Y., Sun, Y., Gaggiano, J. D., Kumar, N., Andris, C., & Parker, A. G. (2022). Visualization Design Practices in a Crisis: Behind the Scenes with COVID-19 Dashboard Creators. IEEE Transactions on Visualization and Computer Graphics.

This paper talks about how designers designed the COVID-19 dashboards. The finding is that the design is shaped by a lot of factors: public engagement, policy, tools, etc.

1. Araújo, T., Chagas, P., Alves, J., Santos, C., Sousa Santos, B., & Serique Meiguins, B. (2020). A real-world approach on the problem of chart recognition using classification, detection and perspective correction. Sensors, 20(16), 4370.

This is a cool project as well. They used neural networks to detect and classify charts in real-world settings like textbooks.

1. Li, R., & Chen, J. (2018, October). Toward a deep understanding of what makes a scientific visualization memorable . In 2018 IEEE Scientific Visualization Conference (SciVis) (pp. 26-31). IEEE.

This paper ran experiments to see what makes a scientific visualization more memorable than others.

1. Guo, Z., Tao, J., Chen, S., Chawla, N., & Wang, C. (2022). SD^2: Slicing and Dicing Scholarly Data for Interactive Evaluation of Academic Performance. IEEE Transactions on Visualization and Computer Graphics.

This paper designed a system to show scholars’ academic performance.

## 2022-10-26 #

1. Wongsuphasawat, K., Smilkov, D., Wexler, J., Wilson, J., Mane, D., Fritz, D., … & Wattenberg, M. (2017). Visualizing dataflow graphs of deep learning models in tensorflow . IEEE transactions on visualization and computer graphics, 24(1), 1-12.

I could not fully understand this paper. I know it is visualizing a convolutional neural network.

This is indeed impressive. I do not know what will happen if SpaceX can launch Starship daily with such a low price.

## 2022-10-25 #

Andry, T., Hurter, C., Lambotte, F., Fastrez, P., & Telea, A. (2021, May). Interpreting the Effect of Embellishment on Chart Visualizations . In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-15).

This is an interesting study. The authors analyzed the effect of embellishment on charts. The finding is that it has positive effects.

## 2022-10-24 #

1. Wang, Q., Chen, Z., Wang, Y., & Qu, H. (2021). A Survey on ML4VIS: Applying Machine Learning Advances to Data Visualization. IEEE Transactions on Visualization and Computer Graphics.

This is a survey paper on the application of ML in VIS. It aims to answer two questions: which VIS processes can be assisted by ML, and how ML can help solve VIS problems.

It identifies seven VIS processes that can be assisted by ML: Data Processing4VIS, Data-VIS mapping, insight communication, style imitation, VIS interaction, VIS rendering, and User profiling.

1. Shen, L., Shen, E., Luo, Y., Yang, X., Hu, X., Zhang, X., … & Wang, J. (2021). Towards natural language interfaces for data visualization: A survey . arXiv preprint arXiv:2109.03506.

This is a survey paper about how to use natural language interfaces in visualization.

1. Wang, Y., Hou, Z., Shen, L., Wu, T., Wang, J., Huang, H., … & Zhang, D. (2022). Towards Natural Language-Based Visualization Authoring . IEEE Transactions on Visualization and Computer Graphics.

In this paper, the authors developed natural language based visualization interfaces, i.e., people can ask machines to make visualizations based on their natural language inputs.

## 2022-10-23 #

1. Henry, N., Goodell, H., Elmqvist, N., & Fekete, J. D. (2007). 20 years of four HCI conferences: A visual exploration. International Journal of Human-Computer Interaction, 23(3), 239-285.

The authors did a comprehensive scientometric analysis of HCI conferences: CHI, InfoVis, UIST, and AVI.

1. Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z., … & Tong, X. (2011). Textflow: Towards better understanding of evolving topics in text. IEEE transactions on visualization and computer graphics, 17(12), 2412-2421.

This paper designed a method to visualize the evolution of topics.

1. Ngo, Q. Q., Dennig, F. L., Keim, D. A., & Sedlmair, M. (2022). Machine learning meets visualization–Experiences and lessons learned . it-Information Technology, 64(4-5), 169-180.

This paper reviews how machine learning can help visualization research and how visualization helps ML research.

## 2022-10-22 #

1. Yu, Y., Cheung, P. Y., Ahn, Y. Y., & Dhillon, P. (2022). Unique in what sense? Heterogeneous relationships between multiple types of uniqueness and popularity in music . arXiv preprint arXiv:2207.12943.

This study looks at the relationship between song novelty and popularity. It found that in general song novelty and popularity are negatively related to each other; there are some nuances, though.

1. Wexler, J., Pushkarna, M., Bolukbasi, T., Wattenberg, M., Viégas, F., & Wilson, J. (2019). The what-if tool: Interactive probing of machine learning models. IEEE transactions on visualization and computer graphics, 26(1), 56-65.

I don’t quite understand this paper. It requires significant knowledge about machine learning. I know the gist of it. It is proposing a ‘what-if’ visualization tool for machine learning.

1. Yu, Y., Hao, Y., & Dhillon, P. (2022). Unpacking Gender Stereotypes in Film Dialogue . In International Conference on Social Informatics (pp. 398-405). Springer, Cham.

The authors studied gender differences in movie dialogues along four dimensions: degree of assertion, degree of confirmation, valence of emotions, and topic. They found that the valence of emotions shows the greatest gender differences.

1. Meng, Y., Wu, W., Wang, F., Li, X., Nie, P., Yin, F., … & Li, J. (2019). Glyce: Glyph-vectors for Chinese character representations. Advances in Neural Information Processing Systems, 32.

This paper uses CNNs to represent Chinese characters. This method outperforms ID-based ones.

# 2022-09 #

## 2022-09-12 (Completed on 2022-10-03) #

1. Antoniak, M., & Mimno, D. (2021, August). Bad seeds: Evaluating lexical methods for bias measurement . In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 1889-1904).

This paper assessed the seeds used in bias measurements. The authors find that the encoding of biases and linguistic features in the seeds affects bias measurements.

1. Antoniak, M., Mimno, D., & Levy, K. (2019). Narrative paths and negotiation of power in birth stories . Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), 1-27.

This is a unique study. The authors analyzed over 2K birth stories computationally.

1. Antoniak, M., & Mimno, D. (2018). Evaluating the stability of embedding-based word similarities . Transactions of the Association for Computational Linguistics, 6, 107-119.

This study finds that word embeddings are sensitive to variations in the source texts. This affects small corpora the most. The authors suggest calculating the average distance over many bootstrap samples.
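To remind myself how bootstrap averaging works, here is a minimal sketch. It resamples documents with replacement, recomputes a similarity score on each sample, and reports the mean and standard deviation. The co-occurrence "embedding" is a toy stand-in that I made up so the example runs with the standard library only; the paper works with real embedding models, and all function names here are my own.

```python
import random
from collections import Counter
from math import sqrt
from statistics import mean, stdev


def cooccurrence_vector(docs, word):
    """Toy stand-in for an embedding: counts of words sharing a document with `word`."""
    vec = Counter()
    for doc in docs:
        tokens = doc.split()
        if word in tokens:
            vec.update(t for t in tokens if t != word)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def bootstrap_similarity(docs, w1, w2, n_samples=50, seed=0):
    """Mean and standard deviation of the similarity over bootstrap samples of the corpus."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_samples):
        # Resample documents with replacement, keeping the corpus size fixed.
        sample = rng.choices(docs, k=len(docs))
        sims.append(cosine(cooccurrence_vector(sample, w1),
                           cooccurrence_vector(sample, w2)))
    return mean(sims), stdev(sims)
```

Reporting the spread next to the mean shows how much a single similarity value depends on which documents happened to be in the corpus, which I take to be the point of the paper's advice.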

## 2022-09-11 (Completed on 2022-10-01) #

1. Battle, L., Feng, D., & Webber, K. (2021). Exploring visualization implementation challenges faced by D3 users online . arXiv preprint arXiv:2108.02299.

I really like this study. The authors analyzed the posts on Stack Overflow about D3.js. They analyzed these posts from two angles: compatibility (how D3 integrates with other tools) and debugging.

I took a look at their codebook. They must have done substantial work behind the scenes.

1. Yu, Y., Cheung, P. Y., Ahn, Y. Y., & Dhillon, P. (2022). Unique in what sense? Heterogeneous relationships between multiple types of uniqueness and popularity in music . arXiv preprint arXiv:2207.12943.

In this paper, the authors analyzed the relationship between song novelty and popularity and found a negative relationship.

## 2022-09-10 (Completed on 2022-09-22) #

This is innovative work. Humans learn from interacting with other humans. Shouldn’t AI do the same? The authors proposed the concept of “socially situated artificial intelligence”. They let AI interact with humans on social media: the AI posts questions about images in human language, and humans respond (or do not respond) in the comments. They treated it as an iterative reinforcement learning problem. The field experiment lasted for eight months and reached 236K social media users. The results show that this method is better than the baseline.

AlphaFold is indeed a breakthrough.

## 2022-09-09 (Completed on 2022-09-21) #

This is a nice piece. The author talks about the trend that AI researchers blend deep learning with classic AI methods.

It introduces “intuitive physics”: for example, two objects cannot be in the same place at the same time, and an object will fall if it is not supported by something. I feel that this is similar to “common knowledge.”

Deep learning systems may distinguish between sofas and chairs but they won’t know that these objects are for people to sit on.

## 2022-09-08 (Completed on 2022-09-20) #

1. Rind, A., Aigner, W., Wagner, M., Miksch, S., & Lammarsch, T. (2016). Task cube: A three-dimensional conceptual space of user tasks in visualization design and evaluation. Information Visualization, 15(4), 288-300.

This is a cool project. The authors posited that visualization tasks can be seen as residing in a three-dimensional space: Abstraction (from abstract to concrete), Composition (from low-level to high-level), and Perspective (objectives versus actions). The first two are continuous whereas the last one is dichotomous.

Then the authors reviewed task related literature and examined the relationship between them and the model proposed here.
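The three dimensions described above map naturally onto a small data type: two continuous coordinates and one two-valued one. The sketch below is my own reading, not something from the paper; the `[0, 1]` scale, the class names, and the example task are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Perspective(Enum):
    """The dichotomous dimension: is the task stated as an objective or an action?"""
    OBJECTIVE = "objective"
    ACTION = "action"


@dataclass(frozen=True)
class TaskPosition:
    """A task's position in the cube. The [0, 1] scale is an assumed convention."""
    abstraction: float   # 0.0 = abstract, 1.0 = concrete
    composition: float   # 0.0 = low-level, 1.0 = high-level
    perspective: Perspective

    def __post_init__(self):
        # Validate the two continuous dimensions.
        for name in ("abstraction", "composition"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


# Hypothetical placement of a "find the maximum value" task.
find_max = TaskPosition(abstraction=0.2, composition=0.1,
                        perspective=Perspective.ACTION)
print(find_max)
```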

I saw that AI will be more widely used in biological science.

I believe there must be many stories on Mars in the past.

## 2022-09-07 (Completed on 2022-09-19) #

1. Dodge, J., Sap, M., Marasović, A., Agnew, W., Ilharco, G., Groeneveld, D., … & Gardner, M. (2021). Documenting large webtext corpora: A case study on the colossal clean crawled corpus . arXiv preprint arXiv:2104.08758.

This paper provides a descriptive analysis of the texts in the Colossal Clean Crawled Corpus.

1. McColeman, C. M., Yang, F., Brady, T. F., & Franconeri, S. (2021). Rethinking the ranks of visual channels . IEEE Transactions on Visualization and Computer Graphics, 28(1), 707-717.

I like this study a lot. The current ranking of visual channels is based on the task of comparing two values. However, this task is not that important in real settings. The authors instead used the task of reproducing a chart right after seeing it. The results show that even if there are only two marks in the chart, the existing ranking does not hold. Also, the ranking changes as there are more marks in the chart.

## 2022-09-06 (Completed on 2022-09-19) #

1. Hassani, K., & Lee, W. S. (2016). Visualizing natural language descriptions: A survey . ACM Computing Surveys (CSUR), 49(1), 1-34.

This is a cool survey. The authors reviewed past research on creating visualizations based on natural language. I am amazed by the creativity and vividness of those creations.

1. Talbot, J., Lee, B., Kapoor, A., & Tan, D. S. (2009, April). EnsembleMatrix: interactive visualization to support machine learning with multiple classifiers. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 1283-1292).

The authors introduced an interactive visualization system built around confusion matrices for working with multiple classifiers. They did a user study and found that it is helpful to users.

## 2022-09-05 (Completed on 2022-09-16) #

1. Buolamwini, J., & Gebru, T. (2018, January). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on fairness, accountability and transparency (pp. 77-91). PMLR.

This is a cool paper. The authors analyzed the accuracy of three commercial gender prediction systems and found that prediction is worse for female faces and for darker faces. It is the worst for darker female faces.
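The word “intersectional” is easier for me to grasp with a small calculation: accuracy is computed per subgroup defined by one attribute, and then by two attributes combined. The records below are made up purely for illustration and are not data from the paper; the field names and the `accuracy_by_group` helper are my own assumptions.

```python
from collections import defaultdict


def accuracy_by_group(records, keys):
    """Return {group: accuracy}, where a group is the tuple of values for `keys`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in keys)
        total[group] += 1
        # A prediction is correct when it matches the true gender label.
        correct[group] += int(r["predicted"] == r["gender"])
    return {g: correct[g] / total[g] for g in total}


# Made-up records for illustration only; NOT data from the paper.
records = [
    {"gender": "female", "skin": "darker", "predicted": "male"},
    {"gender": "female", "skin": "darker", "predicted": "female"},
    {"gender": "female", "skin": "lighter", "predicted": "female"},
    {"gender": "male", "skin": "darker", "predicted": "male"},
    {"gender": "male", "skin": "lighter", "predicted": "male"},
]

by_gender = accuracy_by_group(records, ["gender"])
by_both = accuracy_by_group(records, ["gender", "skin"])
print(by_gender)
print(by_both)
```

In this toy data, the gender-only breakdown hides that all of the errors fall on one intersection, which is the kind of pattern that only shows up when the attributes are combined.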

This study tested the strength of weak ties empirically using 20 million people’s data from LinkedIn. They varied the prevalence of weak ties in their recommender system (People You May Know). They tested whether weak ties are related to job mobility. The results supported the weak tie theory but with three major revisions. First, the relationship between the strength of weak ties and job transmission shows an inverted U shape. Second, moderately weak ties and the weakest ties created the most job mobility. Third, the strength of weak ties is dependent on the type of the industry. Weak ties created job mobility in digital industries but not so much in other industries.

This piece introduces the work above. I guess Wang Uzzi is among its reviewers. While reading it, I returned to a question I had when reading the work above: it is interesting that people are willing to help people who are almost strangers. I say “almost” because, for example, you may have only one friend in common and be unfamiliar with each other. This is fascinating to me. It suggests that society is not as “cold” as we thought.

I didn’t know Ukraine is so beautiful and culturally rich.

## 2022-09-04 (Completed on 2022-09-06) #

1. Horton, R. (2003). Medical journals: evidence of bias against the diseases of poverty. The Lancet, 361(9359), 712-712.

There is an ethnic bias in medical journals, both in the make-up of editorial boards and in who is reported on in these journals.

1. Ma, K. L. (2007). Machine learning to boost the next generation of visualization technology . IEEE Computer Graphics and Applications, 27(5), 6-9.

This paper argues that machine learning can drive the innovations in visualization research.

## 2022-09-03 (Completed on 2022-09-05) #

1. Stokes, C., Setlur, V., Cogley, B., Satyanarayan, A., & Hearst, M. (2022). Striking a Balance: Reader Takeaways and Preferences when Integrating Text and Charts. arXiv preprint arXiv:2208.01780.

This is a unique study. The authors looked at whether viewers prefer annotations and text in line charts. It turns out that viewers prefer charts with many annotations over charts with few annotations or text alone.

1. Stanfill, M. (2012). Finding birds of a feather: Multiple memberships and diversity without divisiveness in communication research. Communication Theory, 22(1), 1-24.

This article argues that, when deciding whether the field of Communication is fragmented, it might be better to look beyond topics alone and consider these three aspects:

1. Methodology: How to measure the world

2. Ontology/epistemology: “What is real and what can we know about it”

3. Axiology: Ethical obligations

1. Chan, C. H., & Grill, C. (2022). The highs in communication research: Research topics with high supply, high popularity, and high prestige in high-impact journals. Communication Research, 49(5), 599-626.

This study looks at popular topics in popular communication journals and analyzes the citation network.

## 2022-09-02 (Completed on 2022-09-03) #

The author argues that deep learning does not generalize well and therefore cannot be trusted. He proposes a hybrid approach: combining classical AI and deep learning. The author believes that reasoning and knowledge should be the priority if we are going to move forward in AI.

One possible step is to derive cognitive models or world models from texts.

1. Ioannidis, J. P., Bendavid, E., Salholz-Hillel, M., Boyack, K. W., & Baas, J. (2022). Massive covidization of research citations and the citation elite . Proceedings of the National Academy of Sciences, 119(28), e2204074119.

Covid research seems to have been very popular.

1. Saad-Falcon, J., Shaikh, O., Wang, Z. J., Wright, A. P., Richardson, S., & Chau, D. H. (2020). PeopleMap: Visualization Tool for Mapping Out Researchers using Natural Language Processing . arXiv preprint arXiv:2006.06105.

This is a cool project. The aim of this paper is to project researchers into a 2D plane based on their research interests and publications.

# 2022-08 #

## 2022-08-31 (Completed on 2022-09-03) #

Marcus, G. (2020). The next decade in ai: four steps towards robust artificial intelligence . arXiv preprint arXiv:2002.06177.

pp. 1-17

## 2022-08-30 (Completed on 2022-09-01) #

1. Chatzimparmpas, A., Martins, R. M., Jusufi, I., & Kerren, A. (2020). A survey of surveys on the use of visualization for interpreting machine learning models . Information Visualization, 19(3), 207-233.

This is a cool paper. The authors analyzed the survey papers on the use of visual analytics for machine learning interpretation.

1. Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021, March). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?🦜 . In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610-623).

This paper discusses the trend toward ever-growing language models and the associated risks. It also proposes solutions; for example, researchers should first evaluate the environmental and financial costs.

1. Li, W., Zhang, S., Zheng, Z., Cranmer, S. J., & Clauset, A. (2022). Untangling the network effects of productivity and prominence among scientists . Nature communications, 13(1), 1-11.

This is important work. The results show that collaboration networks can be viewed as a form of social capital that is distributed in an unequal and gendered way. The study shows that gender differences in productivity and prominence in mid-age researchers can be explained by the collaboration networks. The results also show that collaborating with senior authors is correlated with higher productivity and prominence of the junior authors.

## 2022-08-29 (Completed on 2022-08-31) #

1. Schloss, K. B., Leggon, Z., & Lessard, L. (2020). Semantic discriminability for visual communication. IEEE transactions on visualization and computer graphics, 27(2), 1022-1031.

This study is unique. The authors showed participants a series of bar charts, each containing two bars of different colors. Participants were told that the two bars represented two fruits, but the bars were not labeled. Participants were asked to indicate which bar represented which fruit. The purpose was to see the effect of color on people’s interpretation of the visualization.

1. Waldner, M., Diehl, A., Gračanin, D., Splechtna, R., Delrieux, C., & Matković, K. (2019). A comparison of radial and linear charts for visualizing daily patterns. IEEE transactions on visualization and computer graphics, 26(1), 1033-1042.

This study looked at whether clock-like radial charts are better than line and bar charts in visualizing daily patterns.

## 2022-08-28 (Completed on 2022-08-31) #

Schöffel, S., Schwank, J., & Ebert, A. (2016, July). A user study on multivariate edge visualizations for graph-based visual analysis tasks . In 2016 20th International Conference Information Visualisation (IV) (pp. 165-170). IEEE.

To me, this study looks very strange. They designed two very strange ways of presenting bar charts and examined, through human subjects, which way is better.

## 2022-08-27 (Completed on 2022-08-30) #

1. Günther, E., & Domahidi, E. (2017). What communication scholars write about: An analysis of 80 years of research in high-impact journals. International journal of communication, 11, 21.

This is interesting work. The authors did topic modeling on 15K papers published since the 1930s in 19 major communication journals and analyzed the temporal trends. They find that, unsurprisingly, the internet and new media are becoming a hot topic.

1. Hwang, J. D., Bhagavatula, C., Le Bras, R., Da, J., Sakaguchi, K., Bosselut, A., & Choi, Y. (2021). On symbolic and neural commonsense knowledge graphs.

This study proposes an evaluation framework that tests the utility of knowledge graphs based on how effectively we can learn implicit knowledge from them.

## 2022-08-26 (Completed on 2022-08-30) #

1. Yang, W., Ye, X., Zhang, X., Xiao, L., Xia, J., Wang, Z., … & Liu, S. (2022). Diagnosing Ensemble Few-Shot Classifiers . arXiv preprint arXiv:2206.04372.

The authors designed a visualization system that aids few-shot learning.

I couldn’t really understand this paper. I only know the two basic principles of intelligence: parsimony and self-consistency.

## 2022-08-25 (Completed on 2022-08-29) #

1. Li, Z., Wang, X., Yang, W., Wu, J., Zhang, Z., Liu, Z., … & Liu, S. (2022). A unified understanding of deep NLP models for text classification . IEEE Transactions on Visualization and Computer Graphics.

I was not able to fully understand this article since I am not an expert on NLP. I got the gist of this paper: the authors designed a visualization system that can help NLP researchers get a deeper understanding of the NLP language models.

1. Yuan, J., Chen, C., Yang, W., Liu, M., Xia, J., & Liu, S. (2021). A survey of visual analytics techniques for machine learning . Computational Visual Media, 7(1), 3-36.

In this paper, the authors reviewed papers which use visual analytics to help machine learning. They classified these papers into three categories: before, during, and after ML model building.

## 2022-08-24 (Completed on 2022-08-29) #

Quadri, G. J., & Rosen, P. (2021). A survey of perception-based visualization studies by task . IEEE Transactions on Visualization and Computer Graphics.

This is an ambitious study. This survey is centered around ten low-level tasks related to visualizations. For each task, the authors reviewed how different visual encodings (e.g., position, color, length, area) may aid the task and how different chart types facilitate it.

BTW, the interactive system built in this study is really cool: https://usfdatavisualization.github.io/VisPerceptionSurvey/

## 2022-08-23 (Completed on 2022-08-28) #

1. Chuang, J., Manning, C. D., & Heer, J. (2012, May). Termite: Visualization techniques for assessing textual topic models . In Proceedings of the international working conference on advanced visual interfaces (pp. 74-77).

This is a cool project. The authors created Termite, a visualization system that aids topic modeling.

1. Ma, Y., Tsao, D., & Shum, H. Y. (2022). On the principles of Parsimony and Self-consistency for the emergence of intelligence . Frontiers of Information Technology & Electronic Engineering, 1-26.

pp. 1-4

## 2022-08-22 (Completed on 2022-08-28) #

1. Chuang, J., Ramage, D., Manning, C., & Heer, J. (2012, May). Interpretation and trust: Designing model-driven visualizations for text analysis . In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 443-452).

This is an influential project. The authors proposed a text visualization system. Specifically, they designed a system that shows over 9K PhD dissertations at Stanford University. This system aims to help people understand and validate the output of different data-driven text models.

1. Chuang, J., Manning, C. D., & Heer, J. (2012). “Without the clutter of unimportant words” Descriptive keyphrases for text visualization . ACM Transactions on Computer-Human Interaction (TOCHI), 19(3), 1-29.

This is a very interesting study. The authors asked human judges to produce keyphrases for texts. Then, the authors analyzed the statistical and linguistic features of the chosen keyphrases and determined which features predict keyphrase quality. With this model, they can produce high-quality keyphrases automatically for any text.

## 2022-08-21 (Completed on 2022-08-27) #

1. Paik, H., & Marzban, C. (1995). Predicting television extreme viewers and non-viewers: a neural network analysis . Human Communication Research, 22(2), 284-306.

This is a very surprising study for me. It is surprising because I didn’t know neural networks were used in social science, or in Communication, as early as in the 1990s!

I really liked the abstract of this paper. In fact, its abstract is the best among all papers I’ve seen! The aim of this paper is to identify which variables predict television nonviewers and which predict extreme viewers. The data came from the General Social Surveys of three years: 1988, 1989, and 1990. The authors performed prediction using both neural networks and discriminant analysis. The results show that neural networks outperform discriminant analysis.

The analysis results show that demographics were strong predictors of nonviewers and family related and lifestyle related variables were strong predictors of extreme viewers.

1. Vorderer, P. (2016). Communication and the good life: Why and how our discipline should make a difference . Journal of Communication, 66(1), 1-12.

This is an inspiring piece. With current technologies, we are connected with people anywhere and anytime. But the author asks, is this good? It might not be good. It might be an illusion that we are in contact with so many people; only a few of them really care about you. You are in fact alone in the world.

The author also talked about these three points: 1) We embrace new technologies; 2) We get a lot from technologies and might be happy with them (otherwise, why are we still using them?); 3) However, research shows that productivity, mindfulness, and sleep quality improve when we do not use smartphones anytime and anywhere.

This seems to be a puzzle.

Inspired by this work, I am thinking of studying whether machine learning is doing any good to our lives.

## 2022-08-20 (Completed on 2022-08-27) #

1. McFarland, D. A., Ramage, D., Chuang, J., Heer, J., Manning, C. D., & Jurafsky, D. (2013). Differentiating language usage through topic models . Poetics, 41(6), 607-625.

This paper talks about how they used several different methods of topic modeling in their projects, and how they validated the results.

1. Su, M., Peng, H., & Li, S. (2021). A visualized bibliometric analysis of mapping research trends of machine learning in engineering (MLE) . Expert Systems with Applications, 186, 115728.

This paper did a scientometric analysis of papers in the field of machine learning in engineering. The data was from the Web of Science.

## 2022-08-19 (Completed on 2022-08-26) #

Ha, D., & Schmidhuber, J. (2018). Recurrent world models facilitate policy evolution . Advances in neural information processing systems, 31.

I read the interactive version of this paper but I still found it difficult to really understand.

## 2022-08-18 (Completed on 2022-08-26) #

Ha, D. (2017). A visual guide to evolution strategies . blog. otoro. net.

This is a fun paper. It introduces several different evolution algorithms. The author tested these algorithms on the MNIST dataset and it seems the result is very good.

## 2022-08-17 (Completed on 2022-08-25) #

1. Finished Breiman, L. (2001)

This paper compares two ways of prediction: relying on presumed models, or building models from data. The author also introduced decision trees, random forests, and support vector machines.

The author argues that although machine learning methods are black boxes that are not interpretable, this does not make them inferior to interpretable data models, which are less accurate. He believes that the point is not interpretability, but rather getting accurate information, or prediction.

He warns that a large amount of data is being produced daily, and statistics has to use other tools to solve problems people face rather than sticking to only familiar tools such as data modeling.

1. Matuszek, C., Witbrock, M., Kahlert, R. C., Cabral, J., Schneider, D., Shah, P., & Lenat, D. (2005). Searching for common sense: Populating cyc from the web .

This paper talked about how to use Google to facilitate adding common sense to the database of CYC.

## 2022-08-16 (Completed on 2022-08-24) #

Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author) . Statistical science, 16(3), 199-231.

PP. 1-10

## 2022-08-15 (Completed on 2022-08-24) #

Minaee, S., Kalchbrenner, N., Cambria, E., Nikzad, N., Chenaghlu, M., & Gao, J. (2021). Deep learning–based text classification: a comprehensive review . ACM Computing Surveys (CSUR), 54(3), 1-40.

This is a very impressive review. The authors reviewed the popular models and introduced popular datasets for text classification.

## 2022-08-14 (Completed on 2022-08-23) #

1. Jha, R., Jbara, A. A., Qazvinian, V., & Radev, D. R. (2017). NLP-driven citation analysis for scientometrics . Natural Language Engineering, 23(1), 93-130.

This is a fun paper. The authors talk about the citation purpose and citation polarity (i.e., neutral, positive and negative). The authors identified six citation purposes: criticism, comparison, use, substantiating, basis, and neutral. They talked about how to use NLP techniques to label citation purposes and citation polarity.

## 2022-08-13 (Completed on 2022-08-20) #

Wolchover, N. (2020). Artificial Intelligence Will Do What We Ask. That’s a Problem.

This is a powerful article. It introduces the ideas of Stuart Russell, a CS professor at UC Berkeley. He worries that if machines are only programmed to maximize a certain goal, then the world will be in trouble. He proposed three “principles of beneficial machines”:

1. Machines should be programmed to maximize human preferences.
2. Machines are uncertain about what human preferences are.
3. Machines should learn from human behavior about human preferences.

Russell also believes that the “off-switch” problem is at the core of our control over machines. If we are unable to switch a machine off when it is maximizing its goals, we would be in trouble.

There are also some problems with the three principles above:

1. Humans don’t know what their preferences are. Even if they do, their actions do not necessarily align with those preferences.
2. Humans' preferences change.
3. Good people have preferences. Bad people do as well. What if machines learn from bad people?

## 2022-08-12 (Completed on 2022-08-19) #

1. Walter, N., Cody, M. J., & Ball-Rokeach, S. J. (2018). The ebb and flow of communication research: Seven decades of publication trends and research priorities. Journal of communication, 68(2), 424-440.

This is a fun project. The authors analyzed the research topics and also authors. They identified the research topic trends and found that most authors from 1951 to 2016 were from the United States (83%) and academia (94%).

1. Ephrat, A., & Peleg, S. (2017, March). Vid2speech: speech reconstruction from silent video . In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5095-5099). IEEE.

I was amazed by this paper. It shows that to reconstruct speech based on silent videos, it is important to analyze facial visuals. Only focusing on the mouth areas will decrease the accuracy. The performance of this paper seems to be much higher than the previous ones. I am wondering why it is not widely cited. Maybe other important results were published after this paper, making this paper not so important.

## 2022-08-11 (Completed on 2022-08-17) #

1. McCarthy, J., & Hayes, P. J. (1981). Some philosophical problems from the standpoint of artificial intelligence . In Readings in artificial intelligence (pp. 431-450). Morgan Kaufmann.

I didn’t read the whole paper. It is a very influential paper. It talked about the philosophical questions confronting AI. Maybe I’ll revisit this paper.

1. Kim, M., Yoon, J., Jung, W. S., & Kim, H. (2022). Quantifying the topic disparity of scientific articles . arXiv preprint arXiv:2202.03805.

This is an interesting paper. The authors looked at how the conventionality of a paper is correlated with the citations that it receives. They measured the conventionality of a paper by the cosine distance between the paper and its discipline in a vector space. They found that less conventional papers receive fewer citations.
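To make the measure concrete for myself, here is a minimal sketch of cosine distance between a paper vector and a discipline vector. The embeddings are made up, and representing a discipline by the mean (centroid) of its papers' vectors is my assumption for illustration, not necessarily how the authors construct it.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical 3-dimensional embeddings of papers in one discipline.
discipline_papers = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.9, 0.1, 0.0]]
# Assumption: the discipline is summarized by the centroid of its papers.
discipline_vector = centroid(discipline_papers)

typical_paper = [1.0, 0.1, 0.0]   # points in roughly the same direction
unusual_paper = [0.0, 0.0, 1.0]   # orthogonal to the discipline vector

d_typical = cosine_distance(typical_paper, discipline_vector)
d_unusual = cosine_distance(unusual_paper, discipline_vector)
```

Under this reading, a paper with a larger distance from its discipline would count as less conventional.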

## 2022-08-10 (Completed on 2022-08-16) #

1. Markowitz, D. M., Song, H., & Taylor, S. H. (2021). Tracing the adoption and effects of open science in communication research . Journal of Communication, 71(5), 739-763.

This paper examines the open science practices in the field of Communication. The authors found that there is still much work to be done.

1. Jin, Z., Chauhan, G., Tse, B., Sachan, M., & Mihalcea, R. (2021). How good is NLP? a sober look at NLP tasks through the lens of social impact . arXiv preprint arXiv:2106.02359.

This paper is different from most NLP papers. It talked about how to estimate the contribution of NLP papers for social good. It proposed a framework to do such a job.

1. Fox, J., Pearce, K. E., Massanari, A. L., Riles, J. M., Szulc, Ł., Ranjit, Y. S., … & L. Gonzales, A. (2021). Open science, closed doors? Countering marginalization through an agenda for ethical, inclusive research in communication. Journal of Communication, 71(5), 764-784.

This paper talked about the counteractive effects of open science and discussed possible solutions.

1. de Oliveira, T. M., Marques, F. P. J., Veloso Leão, A., de Albuquerque, A., Prado, J. L. A., Grohmann, R., … & Guazina, L. S. (2021). Towards an inclusive agenda of open science for communication research: A Latin American approach . Journal of Communication, 71(5), 785-802.

This article talked about open science from the perspective of Latin American scholars. The authors state that open science is now understood in two ways: replication and inclusion. For the first interpretation, the authors argue that standardizing the scientific writing process may overlook the diversities in the world. The authors then provided the viewpoints of Latin American scholars on open science.

## 2022-08-09 (Completed on 2022-08-15) #

1. Gurcan, F., Cagiltay, N. E., & Cagiltay, K. (2021). Mapping human–computer interaction research themes and trends from its existence to today: A topic modeling-based review of past 60 years . International Journal of Human–Computer Interaction, 37(3), 267-280.

In this paper, the authors analyzed the trends of HCI topics by topic modeling HCI papers in the past six decades. They found that Brain-Computer Interface is emerging fast, and that human-robot interaction and Mobile are middle-aged topics which are also accelerating.

1. Schlesinger, A., O’Hara, K. P., & Taylor, A. S. (2018, April). Let’s talk about race: Identity, chatbots, and AI . In Proceedings of the 2018 chi conference on human factors in computing systems (pp. 1-14).

This is a very different paper. It talks about how to solve the problem of racism in chatbots. The story of Microsoft’s Tay showed that chatbots can be manipulated. The easy way to protect chatbots from racism is using a blacklist of words. However, this is not the ultimate solution. For example, “Pakistan” might be included in the blacklist. However, doing so will make life inconvenient for Pakistanis. This paper talked about how to solve this problem using NLP, machine learning, and deep learning technology.

## 2022-08-08 (Completed on 2022-08-11) #

Hermann, K. M., Hill, F., Green, S., Wang, F., Faulkner, R., Soyer, H., … & Blunsom, P. (2017). Grounded language learning in a simulated 3d world. arXiv preprint arXiv:1706.06551.

This is a very cool project. The authors built artificial agents that learn human language in a 3D space. I feel this is a future direction: unless AI can relate human language to the real world, we cannot fully trust the decisions made by AI.

## 2022-08-07 (Completed on 2022-08-09) #

1. Finished Machine behaviour

There is a conference called ACM FAccT (ACM Conference on Fairness, Accountability, and Transparency) . It seems to be really cool.

We can study human behavior, and by the same token, we can study machine behavior. AI agents are ubiquitous and are affecting the day-to-day lives of every one of us. The authors give three motivations for the new field they are proposing, machine behavior:

1. AI agents are becoming ubiquitous.
2. It’s hard to predict what outcomes AI agents generate. AI models are becoming increasingly complex. Even if an algorithm is simple, its outcomes may still be complicated. Also, many algorithms in use today are proprietary and not available to the public, making it harder for us to know what outcomes these models will generate.
3. Algorithms are impacting our lives. We need to study their effects on humanity.

What topics or dimensions should we study in the field of machine behavior?

1. Mechanisms behind the behavior. For this, we need more interpretability.

2. How the behavior might change. For example, new data is fed, or the model updates itself from interacting with the world (like reinforcement learning).

3. Functions: what functions do these algorithms fulfill for human creators, for example, companies? These help us understand why some algorithms prevail while others fade away.

1. Schlesinger, A., Edwards, W. K., & Grinter, R. E. (2017, May). Intersectional HCI: Engaging identity through gender, race, and class . In Proceedings of the 2017 CHI conference on human factors in computing systems (pp. 5412-5427).

This paper talks about intersectionality in CHI proceedings. The authors found around 150 past CHI papers about this topic. The findings of this research are: 1) researchers tend to focus on one aspect of user identity rather than intersectionality; 2) there is relatively less research on race than on gender and class.

## 2022-08-06 #

Rahwan, I., Cebrian, M., Obradovich, N., Bongard, J., Bonnefon, J. F., Breazeal, C., … & Wellman, M. (2019). Machine behaviour . Nature, 568(7753), 477-486.

PP. 1-5

## 2022-08-05 (Completed on 2022-08-06) #

Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). Squad: 100,000+ questions for machine comprehension of text . arXiv preprint arXiv:1606.05250.

This is indeed meaningful and impactful work.

Why is the Stanford Question Answering Dataset (SQuAD) needed? To train data-intensive models, we need a large amount of high-quality data. However, the existing datasets are either large but low-quality or high-quality but small.

The team collected English articles from Wikipedia and asked crowd workers to read a passage in four minutes and come up with five questions. They can ask hard questions and are encouraged to use their own words in the question. The answer to each question must come directly from the article (in fact, the crowd workers are asked to highlight the answer in the article). An answer does not have to be a single word; it can be a long phrase.

To make the evaluation more robust, for the development and test sets the team asked additional crowd workers to select the shortest span that answers each question. That is, for these sets, some workers had already proposed questions and provided answers, and the authors then asked other workers to provide possibly shorter answers. This way, if the machine chooses a shorter span than the first answer that still correctly answers the question, it is also counted as correct.

Based on the data, the authors built a logistic regression model which has an F1 score of 50%. The human performance is 86.8% (F1 score).
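To see why collecting several reference answers matters, here is a rough sketch of an F1 score for span answers. I am assuming F1 is computed over overlapping tokens and that the best score across all reference answers is kept; the tokenization here is a plain lowercase whitespace split, which is simpler than whatever normalization the official evaluation applies. The example question and answers are made up.

```python
from collections import Counter

def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted span and one reference span."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

def best_f1(prediction, references):
    """Score a prediction against every reference answer and keep the best."""
    return max(token_f1(prediction, ref) for ref in references)

# Hypothetical references: the original answer plus a shorter one from a second worker.
references = ["the city of Denver", "Denver"]

score_short = best_f1("Denver", references)
score_partial = best_f1("city of Denver Colorado", references)
```

With only the first reference, the short prediction "Denver" would score 0.4; with the shorter second reference it gets full credit.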

## 2022-08-04 #

Heimerl, F., & Gleicher, M. (2018, June). Interactive analysis of word vector embeddings . In Computer Graphics Forum (Vol. 37, No. 3, pp. 253-265).

This is a cool paper. I didn’t understand every detail of it but I know what it is doing. It is basically creating a design space for tasks associated with word embeddings. They came up with three visualization designs that address common word embedding tasks.

## 2022-08-03 #

Gunaratne, S. A. (2009). Globalization: A non-Western perspective: The bias of social science/communication oligopoly . Communication, Culture & Critique, 2(1), 60-82.

This is indeed a very unique paper. The author argues that today’s social science is dominated by Western philosophy. Eastern thought, specifically Buddhism and Taoism, is largely ignored. The author argues that as world power is shifting from Europe and the US to Asia (China, Japan and India), it might be helpful to think about how social science can be done with an Eastern mind.

## 2022-08-02 #

1. Heer, J., & Bostock, M. (2010, April). Crowdsourcing graphical perception: using mechanical turk to assess visualization design . In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 203-212).

The main aim of this project is to test whether MTurk is a viable platform to conduct experiments. The authors replicated seminal experiments by Cleveland & McGill (1984) and also conducted other experiments. They found that the results by Turkers are reliable and believe that using MTurk can lower the cost and time required for conducting experiments.

1. Mohammad, S. (2020, May). Nlp scholar: A dataset for examining the state of nlp research . In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 868-877).

I really like this study. It extracts information on around 40K NLP papers from ACL and then combines the data with Google Scholar. This dataset is very useful for scientometric analysis of NLP.

## 2022-08-01 #

Talbot, J., Setlur, V., & Anand, A. (2014). Four experiments on the perception of bar charts . IEEE transactions on visualization and computer graphics, 20(12), 2152-2160.

This study in some sense replicated the bar chart experiments in the classic study by Cleveland and McGill (1984). In the original study, there are five bar chart types: adjacent bars, separated bars, aligned stacked bars, unaligned stacked bars, and divided bars. Adjacent bars lead to the fewest errors in bar height comparison tasks and divided bars lead to the most errors.

However, Cleveland and McGill (1984) did not study why. In the current study, the authors wanted to examine the mechanisms behind these differences. The results generally confirmed the accuracy ranking in Cleveland and McGill (1984). The authors provided explanations of the error sources. For example, they found that separation between bars makes comparisons difficult.

The authors argue that even for simple charts like bar charts, we still do not understand them fully.

# 2022-07 #

## 2022-07-31 #

Cleveland, W. S., & McGill, R. (1984). Graphical perception: Theory, experimentation, and application to the development of graphical methods . Journal of the American statistical association, 79(387), 531-554.

This is a canonical study.

The idea of this paper is this: rather than designing visualizations based on common sense or intuition, we need to design them based on scientific theories. The authors proposed ten elementary perceptual tasks as shown in Fig. 1. Some examples: (1) position along a common scale, (2) positions along nonaligned scales, (3) length, direction, angle, (4) area, (5) volume, curvature, and (6) shading, color saturation. The tasks are ordered such that the first leads to the most accurate perception. The authors suggest that when we design charts, it is better to use tasks that lead to more accurate perceptions.
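The design advice can be sketched as a lookup: given the encodings a chart could use, pick the one highest in the ranking. The ranking below only paraphrases the six example groups listed above, and the identifier names are labels I made up for this sketch.

```python
# Example groups from the note above, most accurate first.
# The identifier names are hypothetical labels chosen for this sketch.
ACCURACY_RANKING = [
    "position_common_scale",
    "position_nonaligned_scales",
    "length_direction_angle",
    "area",
    "volume_curvature",
    "shading_color_saturation",
]

def most_accurate(available):
    """Return the available encoding that ranks highest in ACCURACY_RANKING."""
    return min(available, key=ACCURACY_RANKING.index)

choice = most_accurate(["area", "length_direction_angle", "shading_color_saturation"])
```

Here `choice` is the length/direction/angle group, since it sits above area and shading in the ranking.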

## 2022-07-30 #

Skau, D., Harrison, L., & Kosara, R. (2015, June). An evaluation of the impact of visual embellishments in bar charts . In Computer Graphics Forum (Vol. 34, No. 3, pp. 221-230).

This is a cool study. The authors studied the visual embellishments of bar charts. They identified six types of embellishments: rounded tops, triangle bars, capped bars, overlapping triangle bars, quadratically increasing area bars, and bars extending below zero. They compared these designs with the baseline bar chart.

To compare the effectiveness, they recruited 100 users from MTurk. These Turkers completed two tasks: identify the absolute value of a bar, and estimate what percentage one bar is of another. The results show that these embellishments hinder, rather than help, users' understanding of the data presented in the bar charts.

## 2022-07-29 #

1. Krishnamoorthy, N., Malkarnenkar, G., Mooney, R., Saenko, K., & Guadarrama, S. (2013, June). Generating natural-language video descriptions using text-mined knowledge . In Twenty-Seventh AAAI Conference on Artificial Intelligence.

This paper introduces a new method of generating natural-language descriptions of videos. It first identifies objects, basically a subject and an object, for example, a person and a motorbike. Then it uses knowledge extracted from web texts to put these two elements together, producing a highly probable description such as “A person is riding a motorbike”.

I agree that this method is innovative but it is not trustworthy at all. We can only get the most likely combination of two objects (if we identify them correctly), but this combination is not necessarily reflective of what is really going on in the video.
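My concern is easy to see in a toy version of the idea. The sketch below picks the verb that most often links a detected subject and object in text-mined counts. The counts and function name are hypothetical, and this is only my simplified reading of the approach, not the authors' actual model.

```python
from collections import Counter

# Hypothetical (subject, verb, object) counts standing in for text-mined statistics.
triple_counts = Counter({
    ("person", "ride", "motorbike"): 120,
    ("person", "wash", "motorbike"): 15,
    ("person", "repair", "motorbike"): 30,
    ("person", "ride", "horse"): 80,
})

def most_likely_verb(subject, obj, counts):
    """Return (verb, relative frequency) for the verb most often seen with this pair."""
    candidates = {
        verb: n for (s, verb, o), n in counts.items() if s == subject and o == obj
    }
    if not candidates:
        return None, 0.0
    total = sum(candidates.values())
    verb = max(candidates, key=candidates.get)
    return verb, candidates[verb] / total

verb, prob = most_likely_verb("person", "motorbike", triple_counts)
```

Whatever the person in the video is actually doing with the motorbike, this toy version always answers "ride", because nothing in it looks at the video beyond the two detected objects.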

1. Ribeiro, M. T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond accuracy: Behavioral testing of NLP models with CheckList . arXiv preprint arXiv:2005.04118.

This is indeed innovative and influential work. The researchers came up with an interactive tool that generates test examples for NLP models. This method is applicable to most NLP models, either commercial or academic. The tool covers (1) capabilities such as Negation, Coreference, Logic, Robustness to typos and other errors, Fairness, etc., and (2) test types (Minimum Functionality tests, Invariance tests, and Directional Expectation tests).

They found that this CheckList is able to identify bugs even in models that have been extensively used and debugged.
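To get a feel for two of the test types, here is a small sketch with a toy keyword-based sentiment model. Everything in it (the model, the helper names, the templates) is hypothetical and written from scratch; it does not use the CheckList tool's own API.

```python
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}

def toy_sentiment(text):
    """Hypothetical keyword-counting classifier standing in for a real NLP model."""
    words = text.lower().replace(".", "").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def invariance_failures(model, template, fillers):
    """Invariance test: swapping the filler should not change the prediction.
    Returns the fillers whose prediction differs from the first filler's."""
    predictions = [model(template.format(name=f)) for f in fillers]
    return [f for f, p in zip(fillers, predictions) if p != predictions[0]]

def mft_failures(model, cases):
    """Minimum functionality test: simple cases with a known expected label.
    Returns (text, expected, predicted) for every case the model gets wrong."""
    failures = []
    for text, expected in cases:
        predicted = model(text)
        if predicted != expected:
            failures.append((text, expected, predicted))
    return failures

inv = invariance_failures(
    toy_sentiment, "{name} said the food was great.", ["Anna", "Omar", "Mei"]
)
mft = mft_failures(
    toy_sentiment,
    [("The food was great.", "positive"), ("The food was not great.", "negative")],
)
```

The toy model passes the name-swap invariance test but fails the negation case, which is the kind of capability-specific bug that an aggregate accuracy number would hide.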

## 2022-07-28 #

Srinivasan, A., Brehmer, M., Lee, B., & Drucker, S. M. (2018, April). What’s the difference? evaluating variations of multi-series bar charts for visual comparison tasks . In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

This study compares four designs of grouped bar charts for comparison tasks. The four designs are: a simple grouped bar chart; grouped bar chart with juxtaposed differences; single bar chart with juxtaposed differences; difference charts. The results show that bar charts with juxtaposed differences are better for comparison tasks.

## 2022-07-27 #

Karduni, A., Wesslen, R., Cho, I., & Dou, W. (2020, April). Du bois wrapped bar chart: Visualizing categorical data with disproportionate values . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

When values to present have large gaps, i.e., the biggest one is way larger than the smallest one, it is challenging to present all these values through a bar chart. Two problematic solutions are: 1) breaking the axis; 2) setting an upper bound for the axis.

This study found that wrapped bar charts enable users to perform better on identification and comparison tasks when the data to visualize is disproportional, but may come with a price: more time spent and higher cognitive load.

## 2022-07-26 #

De Cao, N., Aziz, W., & Titov, I. (2021). Editing factual knowledge in language models . arXiv preprint arXiv:2104.08164.

The purpose of this paper is this: to modify specific memory in a pre-trained language model so that when prompted, it will give the corrected output. This is important because the predictions by the original pre-trained language model might be outdated. It is computationally expensive to retrain the model. It will be best if we can modify the results without training again. The authors of this paper treat this as a learning problem. They basically built a hyper-network, i.e., a neural network that predicts parameters of another neural network. With this, they can predict the corrected parameter and therefore the corrected output.

It turns out that this method is very good. Detailed results are shown in Table 1.

## 2022-07-25 #

I re-read: Heimerl, F., Chang, C. C., Sarikaya, A., & Gleicher, M. (2018). Visual designs for binned aggregation of multi-class scatterplots . arXiv preprint arXiv:1810.02445.

The authors want to create a design space for binned aggregation of multi-class scatterplots. They first came up with a taxonomy of common tasks associated with scatterplots (see Table 1). Then they used two example datasets to demonstrate how different designs of binned aggregation plots can facilitate different tasks.

Then they came up with a specific design space, as shown in Fig. 4.

The authors built a demo system, available at: https://graphics.cs.wisc.edu/Vis/binning/ .
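As a reminder to myself of what binned aggregation of a multi-class scatterplot means at the data level, here is a minimal sketch: points are grouped into square grid cells, and each cell keeps a count per class. The data, the square grid, and the majority-class summary are my own assumptions for illustration; the paper's design space covers many more ways of showing each bin.

```python
from collections import Counter, defaultdict

def bin_points(points, bin_size):
    """Group (x, y, label) points into square bins.
    Returns {(column, row): Counter mapping label -> count}."""
    bins = defaultdict(Counter)
    for x, y, label in points:
        key = (int(x // bin_size), int(y // bin_size))
        bins[key][label] += 1
    return dict(bins)

# Hypothetical two-class data.
points = [
    (0.2, 0.3, "a"), (0.7, 0.9, "a"), (0.4, 0.1, "b"),
    (1.5, 0.2, "b"), (1.1, 0.8, "b"), (1.9, 1.9, "a"),
]

bins = bin_points(points, bin_size=1.0)

# One possible per-bin summary: the majority class in each bin.
majority = {key: counts.most_common(1)[0][0] for key, counts in bins.items()}
```

A design then decides how to draw each bin's counts, for example by coloring the cell by its majority class or by showing a small chart of the class proportions.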

### Question: #

1. I don’t understand Fig. 2. What are the x and y axes specifically?

## 2022-07-24 #

1. Zhou, L., Gao, J., Li, D., & Shum, H. Y. (2020). The design and implementation of xiaoice, an empathetic social chatbot . Computational Linguistics, 46(1), 53-93.

I didn’t read the details of this paper but I know it is about the design of Xiaoice. I instead read some interviews with one of the creators of Xiaoice, Di Li. I was amazed by the ability of Xiaoice.

1. Celikyilmaz, A., Clark, E., & Gao, J. (2020). Evaluation of text generation: A survey . arXiv preprint arXiv:2006.14799.

I didn’t read the details of this paper but I know it talks about the evaluation methods of natural language generation (NLG).

## 2022-07-23 #

1. Finished Brachman, R. J., & Levesque, H. J. (2022, June). Toward a New Science of Common Sense . In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 11, pp. 12245-12249).
• Having common sense is different from knowing more facts. Common sense is the ability to make inferences and decisions based on already known facts.

• For humans, common sense is not always active. Only when we meet unusual circumstances do we resort to common sense. If common sense does not help us find the solution, we use deeper thinking instead.

1. Coupland, R., & Kobi-Renée, L. (2005). Science and prohibited weapons. Science, 308(5730), 1841-1841.

This editorial piece talks about how science can be misused to develop prohibited weapons such as biological and chemical poisons.

1. Muise, D., Hosseinmardi, H., Howland, B., Mobius, M., Rothschild, D., & Watts, D. J. (2022). Quantifying partisan news diets in Web and TV audiences . Science Advances, 8(28), eabn0083.

This study shows that TV, rather than online news, drives partisan audience segregation in the United States.

## 2022-07-22 #

1. Brachman, R. J., & Levesque, H. J. (2022, June). Toward a New Science of Common Sense . In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 11, pp. 12245-12249).

PP. 1-4

## 2022-07-21 #

1. Lai, G., Xie, Q., Liu, H., Yang, Y., & Hovy, E. (2017). Race: Large-scale reading comprehension dataset from examinations . arXiv preprint arXiv:1704.04683.

This paper introduces a new dataset called RACE for question answering. This dataset came from English tests for Chinese high school students. This dataset might be challenging for machines because there are many questions that involve reasoning and therefore cannot be easily answered by taking words from the original texts.

Right now, the highest accuracy achieved by machines is around 70%. It seems very impressive, given that it is the machines that are reading those passages. However, I still am not able to trust this kind of AI. It is still based on statistical learning. I am just wondering: for the next wave of AI revolution, is logical reasoning a necessity or not? Is it enough for us to rely on statistical learning?

1. Liu, J., Cui, L., Liu, H., Huang, D., Wang, Y., & Zhang, Y. (2020). Logiqa: A challenge dataset for machine reading comprehension with logical reasoning . arXiv preprint arXiv:2007.08124.

This is exciting work. The authors presented a new dataset for machine reading. This dataset is based on logical reasoning questions from civil servant tests in China. The authors translated those tests into English. The results show that machines' performance is much worse than humans'. This shows that current NLP language models are not able to do logical inference well.

1. Hutchinson, B., Smart, A., Hanna, A., Denton, E., Greer, C., Kjartansson, O., … & Mitchell, M. (2021, March). Towards accountability for machine learning datasets: Practices from software engineering and infrastructure . In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 560-575).

In this paper, the authors talk about what should be done before and alongside the release of a new dataset for machine learning.

## 2022-07-20 #

Chevalier-Boisvert, M., Bahdanau, D., Lahlou, S., Willems, L., Saharia, C., Nguyen, T. H., & Bengio, Y. (2018). Babyai: A platform to study the sample efficiency of grounded language learning . arXiv preprint arXiv:1810.08272.

I am not able to fully understand the details in this paper. The authors designed a platform to allow computers to understand language by incorporating humans in the training. They designed a 2D grid world and a language with clear structures. A bot agent was designed to play the role of a human teacher. They find that reinforcement learning performs poorly in letting machines learn the language in this physical 2D world.

## 2022-07-19 #

Clark, K., Khandelwal, U., Levy, O., & Manning, C. D. (2019). What does bert look at? an analysis of bert’s attention . arXiv preprint arXiv:1906.04341.

I am not able to fully understand this paper. I know it first had pre-trained language models and then applied these models to BERT. The purpose is to study the attention heads of BERT. The authors find that some attention heads of BERT correspond to syntax and coreference.

## 2022-07-17 (Completed on 2022-07-18) #

Zhu, Y., & Fu, K. W. (2019). The relationship between interdisciplinarity and journal impact factor in the field of Communication during 1997–2016 . Journal of Communication, 69(3), 273-297.

This is the study I would have conducted myself. In fact, I have been thinking about it for quite a while. That said, I am glad that this work has been done; I am able to focus on other topics.

This study relies on citation data of over 90 communication journals indexed in Web of Science. The results show that citations into and out of communication studies concentrate on only a handful of other disciplines, such as social psychology, sociology, and political science. Also, although communication cites other fields more than sociology and political science do, it is very weak at receiving citations from other fields.

Another finding is that the dominance of psychology in communication has been declining. Also, citing interdisciplinary fields outside of the social sciences tends to increase the visibility of communication studies compared to citing other fields.

## 2022-07-16 (Completed on 2022-07-17) #

Leins, K., Lau, J. H., & Baldwin, T. (2020). Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? . arXiv preprint arXiv:2005.13213.

This paper is a response to a paper that utilizes NLP to predict legal sentencing. The current paper talks about ethical concerns associated with this method.

I haven’t read the original paper this current paper is referring to but according to their descriptions, the paper seems horrible to me. I don’t think NLP has reached such an accuracy that people can rely on it to determine people’s life sentences.

## 2022-07-15 (Completed on 2022-07-16) #

Finished the book Rebooting AI: Building artificial intelligence we can trust.

### Chapter 1 #

• Present-day AI is narrow; it can only do a specific task and is not general or broad enough.
• Our current approach to AI does not allow it to be true intelligence.
• People would believe machines understand them even if they somewhat know it’s impossible.

### Chapter 2 #

• Machine learning solely relying on training data isn’t reliable.

### Chapter 3 #

This chapter mainly introduces the history of machine learning, specifically the neural networks.

• Three drawbacks of deep learning

• Deep learning is greedy. It needs lots of data and relies on data. If the problem changes, deep learning no longer works.
• Deep learning is a black box.
• Deep learning is unreliable.
• Deep learning can recognize objects but won’t tell you the relations between them.

• Deep learning is more like an art than science.

• Reading and Robots are two of the most challenging AI domains.

### Chapter 4 #

• Deep learning today is not designed to understand human language. All that is involved is statistics.

• Computers can’t read because they don’t know how the world works.

### Chapter 5: Where’s Rosie #

• Machine learning can identify objects, but it does not know what all those objects mean and their roles in the real world.
• Both human language understanding and real robots need cognitive models that can adapt to the changing world.

### Chapter 6: Insights from the human mind #

• Causality is important.

• Currently, machine learning researchers don’t value ‘prior knowledge’. Instead, they value learning from a blank slate. This might be a mistake.

• After reading this passage, I am thinking of building a system where researchers can share their ‘models’ that have learned (different kinds of) common knowledge. This way, we can build on each other’s work and gradually ‘learn’ about the real world.

• To be really intelligent, machines need to have a sense of three things: time, space, and causality.

• Future AI needs not only big data but also “abstract causal knowledge”.

### Chapter 7 Common Sense, and the path to deep understanding #

• It is super difficult to endow machines with human-like common sense such as “leaves swing if there is wind.”

• To allow machines to master common sense, we need two things: 1) know what kind of common sense is needed; 2) know how to encode/store learned common sense in machines.

• Diagrams are not going to work because there is so much more explicit knowledge not shown there.

• It is difficult for machines to infer the representation of time from a sentence.

• Current AI systems don’t have a sense of space. For example, they do not know that if you put a pen into a mesh bag, the pen will not stay in the bag.

• Simulations aren’t a good solution to causality reasoning in AI.

• To change the paradigm, we need a new kind of learning, one that builds upon existing knowledge, rather than starting from scratch every time.

### Chapter 8 Trust #

• We need new metrics to test general intelligence. The Turing test isn’t a good option.

• It will be dangerous for machines to learn all ethical values from the world, as some existing values, such as biases based on gender and race, are obviously wrong.

### Epilogue #

In hindsight, the turning point will be seen not as the 2012 rebirth of deep learning, but as the moment at which a solution to the challenges of common sense and reasoning yields deeper understanding.

It is hard to predict what the future will be like once general intelligence is achieved. The sky’s the limit.

pp. 172-187

pp. 159-172

pp. 139-159

pp. 119-139

pp. 92-119

pp. 70-92

pp. 54-70

pp. 41-53

pp. 18-41

## 2022-07-05 (Completed on 2022-07-07) #

Marcus, G., & Davis, E. (2019). Rebooting AI: Building artificial intelligence we can trust. Vintage.

pp. 5-18

## 2022-07-04 (Completed on 2022-07-06) #

I was not able to fully understand this paper. It is not for a beginner.

I was not able to fully understand this post.

## 2022-07-03 (Completed on 2022-07-05) #

Song, H., Eberl, J. M., & Eisele, O. (2020). Less fragmented than we thought? Toward clarification of a subdisciplinary linkage in communication science, 2010–2019 . Journal of communication, 70(3), 310-334.

I re-read this paper, trying to get a deeper understanding of it. This time, I know that they first identified around 110 topics, and then clustered them together. Later, they grouped these clusters into subfields. They then examined the subdisciplinary network against four types of networks and concluded that communication as a field is not as fragmented as we previously thought.

I still have the following questions:

1. How does CTM (correlated topic model) actually work?
2. If CTM tells us the correlation between topics, then how can we know the correlations between subdisciplines?
3. Why should we study the subdisciplinary network, but not the topic cluster network, against the four networks?

## 2022-07-02 (Completed on 2022-07-04) #

Shin, T., Razeghi, Y., Logan IV, R. L., Wallace, E., & Singh, S. (2020). Autoprompt: Eliciting knowledge from language models with automatically generated prompts . arXiv preprint arXiv:2010.15980.

I was not able to fully understand this paper but again, I got the gist of it: instead of using manually generated prompts to test the capacity of language models to do downstream tasks, this paper designs a method to automatically generate prompts. Their prompts are better at testing the real capacity of language models.

Their results show that masked language models are able to do sentiment analysis and natural language inference without finetuning.

## 2022-07-01 (Completed on 2022-07-03) #

Min, S., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2021). Noisy channel language model prompting for few-shot text classification . arXiv preprint arXiv:2108.04106.

In this paper, the authors show that for text classification tasks, computing the probability of an input given an output (called the “channel model”) outperforms the traditional direct model where we compute the probability of an output given an input.
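
To keep the two models straight for myself, here is the difference written out, where $x$ is the input text and $y$ is the label. The prior $P(y)$ comes from Bayes' rule; treating it as uniform is my own simplifying assumption, in which case the channel model simply picks the label under which the input is most likely.

$$
\hat{y}_{\text{direct}} = \arg\max_{y} P(y \mid x),
\qquad
\hat{y}_{\text{channel}} = \arg\max_{y} P(x \mid y)\,P(y)
$$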

# 2022-06 #

## 2022-06-30 (Completed on 2022-07-02) #

1. Yang, S., Yim, J., Kim, J., & Shin, H. V. (2022, April). CatchLive: Real-time Summarization of Live Streams with Stream Content and Interaction Data. In CHI Conference on Human Factors in Computing Systems (pp. 1-20).

This is a cool project. The authors came up with CatchLive that can help viewers of live stream videos join in the live streaming anytime and get to know the highlights in previous parts while engaged in the current streaming.

1. Blackwell, A. F. (2015, April). HCI as an Inter-Discipline . In Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (pp. 503-516).

This is indeed interesting work. The author argues that HCI should not aspire to be a scientific discipline. Therefore, there is nothing wrong with CHI not having a core theme. Instead, the author argues that the value of HCI is, being interdisciplinary, to contribute to other disciplines.

## 2022-06-29 (Completed on 2022-07-01) #

1. #CommunicationSoWhite: Race and Power in the Academy and Beyond

1. de Albuquerque, A., de Oliveira, T. M., dos Santos Junior, M. A., & de Albuquerque, S. O. F. (2020). Structural limits to the de-westernization of the communication field: The editorial board in Clarivate’s JCR system . Communication, Culture and Critique, 13(2), 185-203.

This work examines the editorial board members in Comm journals indexed by Web of Science. Unsurprisingly, these board members are dominated by the US.

## 2022-06-28 (Completed on 2022-06-30) #

Cui, J., Zhang, T., Jaidka, K., Pang, D., Sherman, G., Jakhetiya, V., … & Guntuku, S. C. (2022, May). Social Media Reveals Urban-Rural Differences in Stress across China . In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 16, pp. 114-124).

This is a cool project. It shows that in rural China, people post on social media more about personal and emotional stress, whereas in urban China, people post more about stress regarding external events such as politics or the economy.

## 2022-06-27 (Completed on 2022-06-29) #

Shaw, A., Scharkow, M., & Wang, Z. J. (2021). Opening a Conversation on Open Communication Research . Journal of Communication, 71(5), 677-685.

This is an editorial piece. I found it very interesting to read. This article summarizes the special issue on open science in Communication. Some of the key points in papers in this special issue:

• A survey of 1K communication scholars shows that questionable practices are common.
• A growing number of communication research papers mention open science terms, but this has little to no effect on papers' citations.
• We need to consider ethical issues in open science.
• Open science practices may marginalize scholars in the Global South.

## 2022-06-26 (Completed on 2022-06-28) #

Chakravartty, P., Kuo, R., Grubbs, V., & McIlwain, C. (2018). #CommunicationSoWhite . Journal of Communication, 68(2), 254-266.

This paper examines communication scholars' race. Besides the racial composition, it also looks at whether White scholars had more citations (Yes they did).

## 2022-06-25 (Completed on 2022-06-27) #

Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases . Science, 356(6334), 183-186.

This paper shows that human language corpora contain the biases found in human societies.

## 2022-06-24 (Completed on 2022-06-26) #

1. Wang, Y., Peng, T. Q., Lu, H., Wang, H., Xie, X., Qu, H., & Wu, Y. (2021). Seek for success: a visualization approach for understanding the dynamics of academic careers . IEEE Transactions on Visualization and Computer Graphics, 28(1), 475-485.

This study proposes a way to visualize scholars' academic careers.

1. Shen, Q., Wu, T., Yang, H., Wu, Y., Qu, H., & Cui, W. (2016). Nameclarifier: A visual analytics system for author name disambiguation . IEEE transactions on visualization and computer graphics, 23(1), 141-150.

This is a cool project. It aims to facilitate name disambiguation, a very important procedure in the study of sciences, through visualizations. The system also includes humans in the loop, which achieves higher accuracy than depending on machines alone.

1. Wu, T., Ribeiro, M. T., Heer, J., & Weld, D. (2019, January). Errudite: Scalable, reproducible, and testable error analysis . In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

This is a cool paper. The authors utilize interactive visualization to facilitate error analysis in NLP tasks.

## 2022-06-23 (Completed on 2022-06-25) #

1. Kale, A., Wu, Y., & Hullman, J. (2021). Causal Support: Modeling Causal Inferences with Visualizations . IEEE transactions on visualization and computer graphics, 28(1), 1150-1160.

Skimmed through this paper. Didn’t fully understand. I know that the purpose of this paper is to examine whether visualization can help users make causal inferences. The results show that contingency tables in text are better than visualizations in terms of helping users make causal inferences.

1. Heimerl, F., Chang, C. C., Sarikaya, A., & Gleicher, M. (2018). Visual designs for binned aggregation of multi-class scatterplots . arXiv preprint arXiv:1810.02445.

This paper presents a design space for binned aggregation methods for multi-class scatterplots. The optimal design is contingent on specific data and tasks.

## 2022-06-22 (Completed on 2022-06-24) #

Jordan, M. I. (2019). Artificial intelligence—the revolution hasn’t happened yet .

This article says that the current AI technologies are not actually AI; at least not the AI envisioned when the term was coined. The author believes that we are far from reaching the level of real artificial intelligence. He worries that when we are content with the progress we have made so far and when we only focus on solving easy problems with current “AI” technologies, we might forget the long-term goals that are still far away from us.

## 2022-06-21 (Completed on 2022-06-23) #

👍 Finished Bisk, Y., Holtzman, A., Thomason, J., Andreas, J., Bengio, Y., Chai, J., … & Turian, J. (2020). Experience grounds language . arXiv preprint arXiv:2004.10151.

• WS1: Corpus
• WS2: Internet
• WS3: Perception
• WS4: Embodiment
• WS5: The Social World

The first stage tries to rely on limited text data. The second stage tries to collect a massive quantity of text data and have massive language models. However, these models do not know how those words are related to the real world. The third stage learns about the world through video and audio. Models based on these perceptions, however, are not able to test hypotheses through actions in the real world. For example, if I ask “what is the feeling of putting my hands on fire?”, the model won’t be able to answer if there is no relevant data.

The fourth stage is able to allow models to translate language into actions in the real world. The model will have a mental model of the world and know the properties of objects.

The ultimate goal of learning a language is to do something for the world.

The fifth stage is about social interactions. It’s impossible to have a dataset large enough to contain all information in the world. Therefore, it’s necessary for the (machine) learner to interact with humans and participate in world events.

The authors mention that most current NLP research falls into WS2. To move towards WS3, NLP researchers can utilize research in computer vision and speech recognition. In order to test hypotheses through actions (WS4), robotics can help. To socially interact with humans and keep learning (WS5), video games can help.

## 2022-06-20 #

Bisk, Y., Holtzman, A., Thomason, J., Andreas, J., Bengio, Y., Chai, J., … & Turian, J. (2020). Experience grounds language . arXiv preprint arXiv:2004.10151.

pp. 1-6

## 2022-06-19 #

1. Bailey, R. L., Read, G. L., Yan, Y. H., Liu, J., Makin, D. A., & Willits, D. (2021). Camera point-of-view exacerbates racial bias in viewers of police use of force videos . Journal of Communication, 71(2), 246-275.

This study finds that body-worn cameras (BWC) worsen viewers' racial prejudice against Black people.

1. Chakraborty, P., Dutta, S., & Sanyal, D. K. (2022). Personal Research Knowledge Graphs . arXiv preprint arXiv:2204.11428.

This paper talks about the possibility of personal research knowledge graphs (PRKGs). My understanding of PRKGs is that they capture the research activities, for example, research method, topic, and tools. The edges will be something like “method” and “topic”, and nodes will be specific methods, topics, or tools.
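
To make my reading of PRKGs concrete, here is a minimal sketch of such a graph stored as (subject, relation, object) triples. Everything in it is hypothetical: the project name, the relation labels, and the helper functions `build_graph` and `related` are my own illustration of the description above, not the schema proposed in the paper.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object) for one researcher's graph.
# Relations play the role of edge labels; subjects and objects are nodes.
triples = [
    ("apad_reading_log", "topic", "science of science"),
    ("apad_reading_log", "method", "citation analysis"),
    ("apad_reading_log", "tool", "Jupyter Notebook"),
    ("citation analysis", "tool", "Web of Science"),
]


def build_graph(triples):
    """Index triples as graph[subject][relation] -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in triples:
        graph[subject][relation].append(obj)
    return graph


def related(graph, node, relation):
    """Return the nodes reachable from `node` via edges labeled `relation`."""
    if node not in graph:
        return []
    return list(graph[node].get(relation, []))


graph = build_graph(triples)
methods = related(graph, "apad_reading_log", "method")
```

A query such as `related(graph, "citation analysis", "tool")` then answers "which tools go with this method?", which is the kind of lookup I imagine a PRKG supporting.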

## 2022-06-18 #

1. Finished Grimmer, J., Roberts, M. E., & Stewart, B. M. (2021). Machine learning for social science: An agnostic approach . Annual Review of Political Science, 24, 395-419.

This paper reviews how social science can make use of machine learning methods, for example, classification and clustering. The authors argue that an inductive method (finding data first and then discovering patterns) might serve social science better than a deductive method, now that we have a large quantity of data.

1. Pasupat, P., Jiang, T. S., Liu, E. Z., Guu, K., & Liang, P. (2018). Mapping natural language commands to web elements . arXiv preprint arXiv:1808.09132.

The authors collected over 50K natural language commands for manipulating web pages and tested whether models could locate the correct web element from these commands.

## 2022-06-17 #

Grimmer, J., Roberts, M. E., & Stewart, B. M. (2021). Machine learning for social science: An agnostic approach . Annual Review of Political Science, 24, 395-419.

pp. 1-12

## 2022-06-16 #

1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … & Polosukhin, I. (2017). Attention is all you need . Advances in neural information processing systems, 30.

This paper introduces the Transformer, a famous NLP model. Rather than using convolutional or recurrent neural networks, the Transformer uses self-attention to compute the representations of its inputs and outputs. I don’t totally understand what that means but that’s what they say. The Transformer is efficient since it requires less training time. It also set a new state of the art in machine translation.
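
To get a feel for what self-attention computes, here is a minimal pure-Python sketch of scaled dot-product attention, softmax(QKᵀ / √d_k) V, which is the formula given in the paper. The simplification is mine: a real Transformer first maps each token to queries, keys, and values with learned projection matrices and uses several heads, whereas this sketch reuses the raw token vectors for all three. The function names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return outputs


# Toy "sentence" of three 2-dimensional token vectors (hypothetical numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens, tokens, tokens)
```

Each row of `attended` is a blend of all token vectors, weighted by how similar each token is to the one being processed. Every position is computed independently of the others, with no recurrence over the sequence.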

1. Fang, Y., Scott, L., Song, P., Burmeister, M., & Sen, S. (2020). Genomic prediction of depression risk and resilience under stress . Nature human behaviour, 4(1), 111-118.

This study finds that the depression polygenic risk score (MDD-PRS) is a predictor of depression. This association is stronger in the presence of stress. The participants of this study were over 5K training physicians in Europe.

## 2022-06-15 #

1. Hendricks, G., Kramer, B., Maccallum, C. J., Manghi, P., & Neylon, C. (2021). Now is the time to work together toward open infrastructures for scholarly metadata . Impact of social sciences blog.

Microsoft Academic Graph is down. The community is calling for open infrastructure to hold scholarly metadata.

1. Priem, J. (2013). Beyond the paper . Nature, 495(7442), 437-440.

This is a cool article. The author argues that the Web is changing how we communicate scientific findings. It has changed how papers are written, viewed, published, and commented on. The author urges scholars to publish products other than just papers, share them in new places, and “brag” about them using different metrics. The author also urges scholars to think more about “Web-native production” than mere publications.

## 2022-06-14 #

He, J., Neubig, G., & Berg-Kirkpatrick, T. (2021). Efficient nearest neighbor language models . arXiv preprint arXiv:2109.04212.

I am not capable of fully understanding this paper. After listening to the talk by the first author of this paper, I got the main idea of it: nearest neighbor language models are effective but slow. This work tries to improve the efficiency of the model.

## 2022-06-13 #

Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives . IEEE transactions on pattern analysis and machine intelligence, 35(8), 1798-1828.

## 2022-06-12 #

Rule, A., Chiang, M., & Hribar, M. (Draft). Medical Scribes Have a Variable Impact on Documentation Workflows

This is a fun study. It finds that medical scribes affect both what medical staff (technicians and doctors) document and when they document. The results also suggest that scribes without medical training may increase, rather than decrease, the workload for the medical staff they work with.

## 2022-06-11 #

Roberts, A., Raffel, C., & Shazeer, N. (2020). How much knowledge can you pack into the parameters of a language model? . arXiv preprint arXiv:2002.08910.

This article shows that language models pretrained on unlabelled text are able to conduct open-domain question answering without access to any external knowledge base. The accuracy is higher than the benchmark. Also, the accuracy increases as the language model’s size grows. Although large language models are computationally intensive, they don’t need to (1) search for external knowledge, and (2) parse the external knowledge source and extract answers from it.

## 2022-06-09 #

Rule, A., Tabard, A., & Hollan, J. D. (2018, April). Exploration and explanation in computational notebooks . In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

This is a very cool study. The authors analyzed how people use Jupyter Notebooks in real life from three perspectives: 1 million notebooks hosted on GitHub; 200 hand-coded notebooks used in academic publications; and interviews with 15 researchers experienced in using computational notebooks.

The analysis of over 1 million notebooks shows that most of them do not contain narratives but are just code with some loose notes.

Analyses of 200 notebooks for academic use show that most only contain documentation but few contain reasoning or explanation of the results.

In the end, the authors talked about design opportunities for computational notebooks.

## 2022-06-07 #

Khandelwal, U., Levy, O., Jurafsky, D., Zettlemoyer, L., & Lewis, M. (2019). Generalization through memorization: Nearest neighbor language models . arXiv preprint arXiv:1911.00172.

I did not have enough background knowledge to fully understand this paper. That said, I got the gist of it: learning sentence sequences is an easier task than predicting the next word. The authors argue that instead of training on ever increasing datasets, we can learn representations from small datasets and later augment what has been learnt with k-nearest neighbors language models ($k$NN-LM), which is proposed in this paper, over a larger dataset.
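
As far as I understand it, the model keeps a datastore of (context representation, next token) pairs, retrieves the nearest neighbors of the current context, turns their distances into a distribution over next tokens, and interpolates that with the base language model's distribution. Below is a minimal sketch of just the last two steps. The function names, the toy distances, and the interpolation weight `lam` are hypothetical, and the neighbor search itself is left out.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors):
    """Build a next-token distribution from retrieved neighbors.

    `neighbors` is a list of (distance, next_token) pairs. Closer neighbors
    get more weight via exp(-distance), and weights of neighbors that share
    the same next token are summed.
    """
    weights = defaultdict(float)
    for distance, token in neighbors:
        weights[token] += math.exp(-distance)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


def interpolate(p_lm, p_knn, lam):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    vocab = set(p_lm) | set(p_knn)
    return {
        token: lam * p_knn.get(token, 0.0) + (1 - lam) * p_lm.get(token, 0.0)
        for token in vocab
    }


# Toy example: the base model is unsure, the retrieved neighbors favor "Paris".
p_lm = {"Paris": 0.5, "London": 0.5}
neighbors = [(0.0, "Paris"), (0.0, "Paris"), (0.0, "Rome")]
p_knn = knn_distribution(neighbors)
mixed = interpolate(p_lm, p_knn, lam=0.25)  # lam chosen arbitrarily here
best = max(mixed, key=mixed.get)
```

The appeal, as I read it, is that the datastore can grow to a larger dataset without retraining the base model.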

## 2022-06-10 #

Hullman, J., Kapoor, S., Nanayakkara, P., Gelman, A., & Narayanan, A. (2022). The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning . arXiv preprint arXiv:2203.06498.

I wasn’t able to fully understand this paper. It compares the replication issues in experimental psychology and machine learning. In psychology, low-power designs and over-reliance on $p$ values lead to a reproducibility crisis. In ML, the split of the training/test set gives researchers the illusion that they can indeed test the performance of the model in real life.

# 2022-05 #

## 2022-05-30 (Completed on 2022-07-25) #

I re-read: Sarikaya, A., Gleicher, M., & Szafir, D. A. (2018, June). Design factors for summary visualization in visual analytics . In Computer Graphics Forum (Vol. 37, No. 3, pp. 145-156).

## 2022-05-29 #

1. Bozarth, L., Quercia, D., Capra, L., & Scepanovic, S. (2022). The role of the Big Geographic Sort in the circulation of misinformation among US Reddit users . arXiv preprint arXiv:2205.10161.

This is a very innovative study. The authors find that (1) Reddit users are less likely to interact if they are geographically far away from each other, despite the fact that their real identity and location are not known to each other; (2) news seldom circulates across states, possibly due to the geographically restrained interactions; (3) Reddit users' news consumption is affected more by the personality of the state where the user is located than by the platform itself; and (4) the personality of states is more related to culture than to political beliefs.

1. Quercia, D., Schifanella, R., & Aiello, L. M. (2014, September). The shortest path to happiness: Recommending beautiful, quiet, and happy routes in the city . In Proceedings of the 25th ACM conference on Hypertext and social media (pp. 116-125).

This is also a very innovative study. The authors propose that instead of recommending only the shortest path, we should consider beauty, quietness, and happiness when suggesting routes.

## 2022-05-28 (Completed on 2022-05-29) #

Bagrow, J., & Ahn, Y. (2022). Network Cards: concise, readable summaries of network data

This study proposes that we add network cards to network data. This is just like documentation.

## 2022-05-27 (Completed on 2022-05-29) #

Nummenmaa, L., Glerean, E., Hari, R., & Hietanen, J. K. (2014). Bodily maps of emotions . Proceedings of the National Academy of Sciences, 111(2), 646-651.

This is indeed a very innovative study. The authors asked participants from different cultures to indicate which parts of their body became active and less active when they experience different emotions. The results show that emotions correspond to discrete and yet overlapping areas in our body. These results may help us detect and respond to emotions.

## 2022-05-26 (Completed on 2022-05-28) #

1. Hajibabaei, A., Schiffauerova, A., & Ebadi, A. (2022). Gender-specific patterns in the artificial intelligence scientific ecosystem . Journal of Informetrics, 16(2), 101275.

The authors analyzed around 40K publications about artificial intelligence published during 2000-2019. The results show an increasing trend for male-female collaborations.

1. Jiang, L., Stocco, A., Losey, D. M., Abernethy, J. A., Prat, C. S., & Rao, R. P. (2019). BrainNet: a multi-person brain-to-brain interface for direct collaboration between brains . Scientific reports, 9(1), 1-11.

This is a super cool study. The experiment involved three human participants. They played a game. Two participants are senders and one is the receiver. Senders' intentions are recorded by the EEG and then delivered to the receiver through TMS. This way, the three people communicate with each other purely through their brain signals.

## 2022-05-23 #

1. Battiston, P., Sacco, P. L., & Stanca, L. (2022). Cover effects on citations uncovered: Evidence from Nature . Journal of Informetrics, 16(2), 101293.

This is an interesting study. It finds that although Nature cover publications received significantly more citations than non-cover articles, publishing a cover article decreases citations to its authors' previous articles compared to citations to non-cover article authors' previous articles.

1. Shang, J., Zeng, M., & Zhang, G. (2022). Investigating the mentorship effect on the academic success of young scientists: An empirical study of the 985 project universities of China . Journal of Informetrics, 16(2), 101285.

I do not fully understand this article’s result but I got the gist of it. It shows that for young scientists in China, being able to get an academic title (basically an award offered by the government) is influenced by two factors: research output and the status of their mentors. The authors show that the second factor, mentors, does not have a very strong effect.

## 2022-05-22 #

1. Zhang, G., Xu, S., Sun, Y., Jiang, C., & Wang, X. (2022). Understanding the peer review endeavor in scientific publishing . Journal of Informetrics, 16(2), 101264.

This is indeed interesting work! The authors analyzed how researchers' gender, culture, English proficiency, research field, and the economy of their country of origin influence the length of their peer reviews. The authors find that males, those working in humanities and social sciences, those who are proficient in English, those in developed economies, and those in non-Confucian cultures tend to write longer reviews.

1. Wu, J., Ou, G., Liu, X., & Dong, K. (2022). How does academic education background affect top researchers’ performance? Evidence from the field of artificial intelligence . Journal of Informetrics, 16(2), 101292.

This is an interesting paper. The authors analyzed top AI researchers' educational backgrounds. Unsurprisingly, they found that most of these researchers were educated in the United States and got their highest degrees from prestigious universities.

Some of the major findings of this paper:

• Publishing more during one's studies is related to higher output later.
• Those who graduated from prestigious universities have higher citations.
• The degrees received, i.e., BS, MS, or PhD, influence scientific performance for people working in industry but not for those working in academia.

## 2022-05-21 (Completed on 2022-05-22) #

Lin, Y., Evans, J. A., & Wu, L. (2022). New directions in science emerge from disconnection and discord . Journal of Informetrics, 16(1), 101234.

This study finds that novel papers, characterized by disconnected papers in their references, disrupt science. Conventional papers, characterized by well-connected papers in their references, develop science. It takes longer for novel papers to reveal their impact, whereas conventional papers accumulate many citations more quickly.

## 2022-05-20 #

This study finds that video conferencing hinders the generation of creative ideas. https://www.nature.com/articles/s41586-022-04643-y

## 2022-05-19 #

1. Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: context, process, and purpose . The American Statistician, 70(2), 129-133.

The American Statistical Association published this statement on $p$-values. It clarifies that $p$-values do not measure the probability that a hypothesis is true, nor do they measure the size of an effect. The statement notes that $p$-values are oftentimes misunderstood and misused by researchers.
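
A small numeric illustration of the "not an effect size" point, under assumptions that are entirely mine: a two-sided z-test of a mean against zero with a known standard deviation of 1, and made-up sample sizes. The helper `two_sided_p_value` is hypothetical and only uses the standard normal tail via `math.erfc`.

```python
import math


def two_sided_p_value(mean_diff, sd, n):
    """Two-sided p-value of a z-test for a sample mean against zero.

    Assumes the population standard deviation `sd` is known and the sample
    mean is approximately normal.
    """
    z = mean_diff / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# A tiny effect measured on a huge sample...
p_tiny_effect = two_sided_p_value(mean_diff=0.02, sd=1.0, n=1_000_000)
# ...versus a much larger effect measured on a small sample.
p_large_effect = two_sided_p_value(mean_diff=0.5, sd=1.0, n=10)

print(p_tiny_effect, p_large_effect)
```

The tiny effect comes out "significant" at the usual 0.05 threshold while the effect 25 times larger does not, so the $p$-value here mostly reflects sample size rather than how big the effect is.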

1. Nuzzo, R. (2014). Statistical errors . Nature, 506(7487), 150.

## 2022-05-18 #

Smaldino, P. E. (2017). Models are stupid, and we need more of them . Computational social psychology, 311-331.

This is indeed interesting work. The author argues that although models are imperfect, they are necessary. For example, Newton’s model of gravity was shown to be inaccurate by general relativity, but it is nonetheless a very good approximation of reality and very helpful for humans to understand the world.

## 2022-05-17 #

1. Bhattacharya, J., & Packalen, M. (2020). Stagnation and scientific incentives (No. w26752) . National Bureau of Economic Research.

With citation being the indicator of a scientist’s success, more and more scientists are working on the Incremental Advance stage of scientific ideas, rather than the Exploration and the Breakthrough stages, as is illustrated in Figure 4. This causes scientific stagnation.

The authors came up with an “edge factor” which measures scientific novelty. Impact factor rewards work done in the breakthrough and incremental stages whereas edge factor rewards exploratory work.

The authors also mentioned how authors and journals may “play the game” of edge factor. For example, they can describe an old idea using synonyms so that their article is regarded as containing new ideas. The authors believe that this problem can be solved by building a comprehensive dictionary or using machine learning algorithms.

1. Shibayama, S., Yin, D., & Matsumoto, K. (2021). Measuring novelty in science with word embedding . PloS one, 16(7), e0254034.

As the title indicates, the authors used machine learning to measure the novelty of a scientific paper based on its reference lists.

## 2022-05-16 #

1. Finished Smaldino & McElreath (2016).

The authors demonstrated, through a model and simulation, why bad science triumphs in today’s scientific culture.

1. Bhattacharya, J., & Packalen, M. (2020). Stagnation and scientific incentives (No. w26752) . National Bureau of Economic Research.

PP. 1-26

## 2022-05-15 #

Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society open science, 3(9), 160384.

PP. 1-8

## 2022-05-14 #

1. Bowman, N. D., & Keene, J. R. (2018). A layered framework for considering open science practices . Communication Research Reports, 35(4), 363-372.

This is an interesting study. The authors proposed a “layered” framework for open science practices. From the outermost layer to the innermost, the layers are: by request, shared materials, shared analysis, shared data, and pre-registration.

I am thinking of a follow-up study: what is the status quo of open science practices in publications in the field of communication? How many of them share data, and how many are pre-registered?

1. Lu, L., Liu, J., Yuan, Y. C., Lu, E., & Li, D. (2022). Psychological antecedents of COVID-19 information sharing within strong-tie and weak-tie networks . PEC innovation, 1, 100035.

This study looks at how emotions and beliefs (about the usefulness of information) are related to sharing information about COVID-19. The authors find that people who have negative emotions and a strong belief that the information will help prevent the disease tend to share it with their strong-tie friends. For information sharing with weak-tie friends, only negative emotion is related.

## 2022-05-13 #

Petroni, F., Rocktäschel, T., Lewis, P., Bakhtin, A., Wu, Y., Miller, A. H., & Riedel, S. (2019). Language models as knowledge bases? . arXiv preprint arXiv:1909.01066.

Although I was not able to fully understand this article, I think it is an interesting study. The authors are interested in whether pre-trained language models, both unidirectional (fairseq-fconv, Transformer-XL) and bidirectional (ELMo and BERT), store relational knowledge, and how these models compare with knowledge sources (such as Google-RE and T-REx) in terms of allowing us to extract that relational knowledge (for example, where was Dante born?). The results show that BERT-large performs very well in terms of enabling knowledge extraction.

## 2022-05-12 #

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning . nature, 521(7553), 436-444.

This article explains how neural networks work. The authors then introduce convolutional neural networks and recurrent neural networks. At the end of the article, they talk about the future of deep learning. They believe that unsupervised learning will be more important than supervised learning in the future.

## 2022-05-11 #

1. Lu, L., Liu, J., & Yuan, Y. C. (2021). Cultural differences in cancer information acquisition: cancer risk perceptions, fatalistic beliefs, and worry as predictors of cancer information seeking and avoidance in the US and China . Health Communication, 1-10.

In this study, the authors looked at how perceived cancer risks, fatalistic beliefs, and worry about cancer are correlated with cancer information seeking and cancer information avoidance in the US and China.

They found that perceived risk and cancer worry are correlated with increased cancer information seeking in the US and China. Cancer worry is negatively related to cancer information avoidance in the US but positively related to avoidance in China. This indicates that Chinese survey takers who are worried about cancer actively seek certain information about cancer but at the same time avoid other information.

Since this is a cross-sectional study, it is better not to use the word “predictors”.

1. Lu, L., Liu, J., Yuan, Y. C., Burns, K. S., Lu, E., & Li, D. (2021). Source trust and COVID-19 information sharing: the mediating roles of emotions and beliefs about sharing . Health Education & Behavior, 48(2), 132-139.

This is an interesting study. The authors were interested in how trust in sources of COVID-19 information is related to information sharing intentions, and how this relationship is mediated by (1) beliefs about sharing, and (2) emotions. They found that health professionals, government, and academic institutions are more trusted than social media, family, and friends.

As for mediations, they found that high trust is related to higher intentions of sharing through the belief that sharing leads to positive outcomes. On the other hand, low trust is related to higher intentions of sharing through negative emotions (anxiety, fear, and anger).

## 2022-05-10 #

Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning . arXiv preprint arXiv:1702.08608.

This is a very influential work. The authors tried to define what interpretability is and how to measure it. Specifically, the authors came up with some evaluation methods.

## 2022-05-09 #

Collison, P., & Nielsen, M. (2018). Science is getting less bang for its buck . The Atlantic.

This is a thought-provoking article. The authors were interested in whether science is slowing down and why.

They distinguished between a limited frontier and an endless one. If science is a limited frontier, that means the number of scientific areas and scientific questions we can ask is limited, and the map of science is gradually being filled. If we regard science as an endless frontier, then there are an endless number of possible questions we can answer. If we view science this way, then the slowdown of science can be attributed to the fact that we are focusing on only established fields and not exploring new areas.

Productivity growth in the US has been declining since the 1950s, rather than increasing. Since the 1970s, we haven’t seen many advances in major technologies, except for computers and the internet.

## 2022-05-08 #

1. Li, Y., & Bond, R. M. (2022). Evidence of the persistence and consistency of social signatures . Applied Network Science, 7(1), 1-19.

The authors find that communication networks via text and phone calls are more stable than in-person ones.

1. Hagar, N., Bandy, J., Trielli, D., Wang, Y., & Diakopoulos, N. (2020). Defining local news: a computational approach . In Computational+ Journalism Symposium 2020.

This is indeed interesting work! The authors were interested in how we can define local news. They chose two national news outlets, two regional ones, and two local ones, and then analyzed these outlets' followers on Twitter, focusing especially on the followers' locations. They found clear differences among these outlets in the cumulative percentage of followers by distance (from the outlet), as shown in Figure 1.

There are, of course, drawbacks of this approach. First, I am not sure why they do not plot hundreds of outlets rather than just six. Second, subscribers to news outlets are not necessarily Twitter users. Third, not all Twitter users share their location information.
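The cumulative-percentage comparison is easy to sketch. This is my own minimal illustration, not the authors' code: the function name, distance thresholds, and follower distances are all made up.

```python
from bisect import bisect_right

def cumulative_share_by_distance(distances_km, thresholds_km):
    """Percentage of followers located within each distance threshold."""
    ordered = sorted(distances_km)
    total = len(ordered)
    return [100.0 * bisect_right(ordered, t) / total for t in thresholds_km]

# Made-up follower distances (km from the outlet's home city), not data from the paper.
local_outlet = [2, 5, 8, 12, 20, 35, 60, 150, 900, 2500]
national_outlet = [10, 300, 650, 900, 1200, 1800, 2400, 3100, 3900, 4200]
thresholds = [50, 500, 5000]

local_curve = cumulative_share_by_distance(local_outlet, thresholds)
national_curve = cumulative_share_by_distance(national_outlet, thresholds)

for t, loc, nat in zip(thresholds, local_curve, national_curve):
    print(f"within {t:>4} km: local {loc:.0f}%, national {nat:.0f}%")
```

In this toy data, the local outlet's curve rises steeply at short distances while the national outlet's curve rises slowly, which is the kind of difference the paper uses to tell outlet types apart.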

1. Wang, Y., & Diakopoulos, N. (2021, May). Journalistic source discovery: Supporting the identification of news sources in user generated content . In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-18).

This is indeed interesting work! The authors were interested in what user-generated content journalists are looking for. They interviewed professional journalists to find the answers. The authors then built a tool to help journalists find the content they need.

## 2022-05-07 #

1. Barthelemy, R. S., Swirtz, M., Garmon, S., Simmons, E. H., Reeves, K., Falk, M. L., … & Atherton, T. J. (2022). LGBT+ physicists: Harassment, persistence, and uneven support . Physical Review Physics Education Research, 18(1), 010124.

The authors conducted an online survey with 324 LGBT+ physicists. They found that 22% of them experienced exclusionary behavior in the past year and 36% of them are considering leaving the field. The authors also find that transgender physicists are more likely to experience exclusionary behavior; 49% of them experienced it in the past year.

1. Sparks, K., Moehl, J., Weber, E., Brelsford, C., & Rose, A. (2022). Shifting temporal dynamics of human mobility in the United States . Journal of Transport Geography, 99, 103295.

This study examines how COVID-19 affected human mobility patterns. The authors find that during the pandemic, in the USA, people’s morning activities started later and evening activities started earlier. Also, behavioral patterns on weekdays became more similar to those on weekends.

1. Wu, L., Kittur, A., Youn, H., Milojević, S., Leahey, E., Fiore, S. M., & Ahn, Y. Y. (2022). Metrics and mechanisms: Measuring the unmeasurable in the science of science . Journal of Informetrics, 16(2), 101290.

This is an interesting study. The authors propose that we can measure science in these three categories: hot vs cold science, hard vs soft science, and fast vs slow science. The authors also talked about how to measure those metrics.

## 2022-05-06 #

1. Abhari, R., Vincent, N., Dambanemuya, H. K., Bodon, H., & Horvát, E. Á. (2022). Twitter Engagement with Retracted Articles: Who, When, and How?. arXiv preprint arXiv:2203.04228.

The authors analyzed Twitter discussions on retracted papers. They find that (1) retracted papers receive more discussion than normal papers, especially among public users and social bots; and (2) most of the discussions occur before retraction.

1. Vásárhelyi, O., Zakhlebin, I., Milojević, S., & Horvát, E. Á. (2021). Gender inequities in the online dissemination of scholars’ work. Proceedings of the National Academy of Sciences, 118(39).

This is interesting work. The authors find that female scientists' work is mentioned less online than men’s. In addition, the authors find that for men, prior scientific impact and collaboration networks are associated with higher online visibility; for women, however, there are no clear indicators of visibility.

The authors shared these valuable resources:

## 2022-05-05 #

1. Rakita, D., Mutlu, B., & Gleicher, M. (2017, March). A motion retargeting method for effective mimicry-based teleoperation of robot arms . In Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction (pp. 361-370).

The authors created an interface where a user can control a robot arm by moving their hand (and the robot arm will mimic the movement).

1. Szafir, D. A., Haroz, S., Gleicher, M., & Franconeri, S. (2016). Four types of ensemble coding in data visualizations . Journal of vision, 16(5), 11-11.

The authors proposed that there are four types of ensemble tasks for visualizations: summary, identification, pattern recognition, and segmentation. They also argued that there exist some unanswered questions in each of the tasks.

## 2022-05-04 #

1. Finished Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National academy of Sciences, 101(suppl 1), 5228-5235.

This is indeed great work. I tried to understand the details of this article but gave it up. It is far beyond my capabilities now. But I will revisit this paper.

This paper proposed a statistical LDA method to find scientific topics and applied this method to identify topics by analyzing PNAS abstracts. The authors identified 300 topics and analyzed hot and cold topics.
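To help myself when I revisit the paper, here is a toy sketch of collapsed Gibbs sampling for LDA, the inference approach the paper is known for. It is only my own minimal version under simple assumptions: the function name, hyperparameter values, and tiny corpus are made up, and nothing here reproduces the authors' code or their PNAS analysis.

```python
import random

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # times word w is assigned to topic k
    topic_total = [0] * n_topics                              # tokens assigned to topic k overall
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized probability of each topic given all other assignments.
                weights = [
                    (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                    * (doc_topic[d][t] + alpha)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimate of each topic's distribution over words.
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in range(n_topics)
    ]
    return phi, doc_topic

# Made-up toy corpus: word ids index into vocab.
vocab = ["gene", "dna", "protein", "star", "galaxy", "orbit"]
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 4, 5], [0, 1, 3, 4]]
phi, doc_topic = lda_gibbs(docs, n_topics=2, vocab_size=len(vocab))

for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda w: row[w], reverse=True)[:3]
    print(f"topic {k}:", [vocab[w] for w in top])
```

The key step is the inner loop: each token is temporarily removed from the counts, and a new topic is drawn in proportion to how much that topic likes the word times how much the document likes that topic.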

1. Tang, S., Zhang, X., Cryan, J., Metzger, M. J., Zheng, H., & Zhao, B. Y. (2017). Gender bias in the job market: A longitudinal analysis. Proceedings of the ACM on Human-Computer Interaction, 1(CSCW), 1-19.

This is indeed innovative work. The authors analyzed 17 million job listings on LinkedIn posted between 2005 and 2016. They found that gender biases in the wording of job listings have been declining over the years. They also changed the wording of some listings to remove gender biases and tested whether these changes lead to more applications from people who would not otherwise apply. They found that the effects of wording changes are limited and smaller than the effects of participants' preconceived biases about job types. For example, even if a listing for a technology-related job is free of gender-biased words, female participants may not want to apply.

## 2022-05-03 #

Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National academy of Sciences, 101(suppl 1), 5228-5235.

PP. 1-2

## 2022-05-02 #

I read some news on Phys.org and also at https://www.science.org/news/all-news

## 2022-05-01 (Completed on 2022-05-02) #

1. Lee, C. Y. P., Zhang, Z., Herskovitz, J., Seo, J., & Guo, A. (2021, October). CollabAlly: Accessible Collaboration Awareness in Document Editing. In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (pp. 1-4).

The authors created CollabAlly, a tool that helps blind people collaboratively edit documents.

1. Pinho-Gomes, A. C., Peters, S., Thompson, K., Hockham, C., Ripullone, K., Woodward, M., & Carcel, C. (2020). Where are the women? Gender inequalities in COVID-19 research authorship . BMJ Global Health, 5(7), e002922.

The authors analyzed author genders in COVID-19 papers. They found that women accounted for 1/3 of all authors.

1. Cryan, J., Tang, S., Zhang, X., Metzger, M., Zheng, H., & Zhao, B. Y. (2020, April). Detecting gender stereotypes: lexicon vs. supervised learning methods . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-11).

This paper finds that, for detecting gender biases, a supervised learning model built by fine-tuning BERT is more accurate and robust than the traditional lexicon method.

# 2022-04 #

## 2022-04-30 (Completed on 2022-05-02) #

1. Kim, H., Rossi, R., Du, F., Koh, E., Guo, S., Hullman, J., & Hoffswell, J. (2022). Cicero: A declarative grammar for responsive visualization . CHI2022.

This is a cool paper. The authors invented a declarative grammar to create responsive visualizations.

1. Chen, K., Jeon, J., & Zhou, Y. (2021). A critical appraisal of diversity in digital knowledge production: Segregated inclusion on YouTube . New Media & Society, 14614448211034846.

The authors analyzed YouTube channels that are related to science. They analyzed these channels' profile diversity and citation network. They found that those profiles are diverse and that a few videos attracted most of the citations.

1. Rakita, D., Mutlu, B., & Gleicher, M. (2018, March). An autonomous dynamic camera method for effective remote teleoperation. In 2018 13th ACM/IEEE International Conference on Human-Robot Interaction (HRI) (pp. 325-333). IEEE.

The authors proposed a method that facilitates remote teleoperation of a robot by using a camera-in-hand robot.

1. Ekdale, B., Rinaldi, A., Ashfaquzzaman, M., Khanjani, M., Matanji, F., Stoldt, R., & Tully, M. (2022). Geographic Disparities in Knowledge Production: A Big Data Analysis of Peer-Reviewed Communication Publications from 1990 to 2019 . International Journal of Communication, 16, 28.

This paper analyzed communication scholars from a geographic standpoint. They found that the proportion of scholars from North America and Europe has been declining and that scholars from North America and Europe are overrepresented in high prestige journals.

## 2022-04-29 (Completed on 2022-05-02) #

1. Wang, Y., Jung, C., Wang, R., & Kim, Y. S. (2022). What makes web data tables accessible? Insights and a tool for rendering accessible tables for people with visual impairments.

This study first looks at why a table is inaccessible to blind people. The authors then made a Chrome extension to convert inaccessible HTML tables to accessible ones for blind users.

1. Kim, H., Rossi, R., Sarma, A., Moritz, D., & Hullman, J. (2021). An Automated Approach to Reasoning About Task-Oriented Insights in Responsive Visualization. IEEE transactions on visualization and computer graphics, 28(1), 129-139.

This is a cool paper. The authors used machine learning to create a recommender system which recommends responsive visualizations based on visualizations on large screens.

## 2022-04-28 (Completed on 2022-05-02) #

• Do an internship during Summer.
• Go to conferences and talk with people.

## 2022-04-27 (Completed on 2022-05-02) #

• Help other graduate students. And don’t compare yourself with others.

• Improve your work efficiency rather than work for longer hours.

• Your grades in classes do not equate with how good you are.

• Try to do an internship.

• Have a break or a vacation when you need to.

## 2022-04-26 (Completed on 2022-05-01) #

45:00-end

• Companies that have more diversity have better performance

• We all have biases. Acknowledging them is important because if we do not acknowledge them, we cannot do anything about them.

• We form a first impression of people we meet. If we have a good impression of them, then it’s easy for us to like them.

• Small changes can make a big difference.

• Not being part of the problem doesn’t mean you are actively part of the solution.

• Oftentimes we are unaware of the biases we have in our mind.

### Four types of biases we can change #

• Performance bias. Men’s performance is often overestimated. Race also plays a role here. White people might get hired or promoted because of their potential, but people of color get hired or promoted based on the work they have achieved and the ability they have demonstrated.

• Performance attribution. If we see a man who is successful, we might think it is because of his brilliance but if a woman is successful, we might attribute her success to luck and hard work.

• Competence vs. likeability tradeoff: if a woman is very competent, then we might think she is selfish and not likable but men do not have this tradeoff.

• We think that women should do the office housework. To counteract it, we should assign people to do the housework rather than letting people volunteer (because women will be more likely to volunteer).
• Maternal bias: we believe strongly that mothers cannot be good employees.

• Don’t make assumptions that women cannot be good employees when they become mothers. Instead, have direct conversations with them and plan ahead.

### What we can do to counteract #

• A company should set decision-making criteria at the beginning.

• More:

23:00-45:00

0:00-23:00

## 2022-04-23 (Completed on 2022-04-30) #

Wu, A., Tong, W., Dwyer, T., Lee, B., Isenberg, P., & Qu, H. (2020). Mobilevisfixer: Tailoring web visualizations for mobile phones leveraging an explainable reinforcement learning framework. IEEE Transactions on Visualization and Computer Graphics, 27(2), 464-474.

This is a very cool project. The authors used reinforcement learning to adapt web visualizations to mobile screens.

## 2022-04-22 (Completed on 2022-04-29) #

1. A draft shared by a friend.

1. Guan, L., Liang, H., & Zhu, J. J. (2022). Predicting reposting latency of news content in social media: A focus on issue attention, temporal usage pattern, and information redundancy . Computers in Human Behavior, 127, 107080.

This is an interesting study. The authors looked at what factors influence reposting speed on Twitter. First, they find that reposting latency on Twitter is highly skewed: half of the posts examined were reposted within half an hour, while the maximum gap between first appearance and reposting was over 70k hours. Second, they find that Twitter users who care about multiple issues are slower in reposting Tweets.

## 2022-04-21 (Completed on 2022-04-28) #

1. Gleicher, M., Yu, X., & Chen, Y. (2022). Trinary tools for continuously valued binary classifiers . Visual Informatics.

This study looks at how to visually communicate the continuous values of binary classification.

1. A draft shared by a friend that is under review.

This is indeed innovative work!

This study looks at what questions blind people commonly ask when they try to understand data charts. The authors then come up with a taxonomy for a “chart QA system” where blind people ask the machine what they want to know about the chart.

## 2022-04-20 (Completed on 2022-04-27) #

Philip Guo’s newsletter of April 2022, on his 10-year anniversary of PhD dissertation defense.

• PhD oral defense is not really a serious thing. The fact that the committee agreed that you can do the defense means they won’t fail you. That said, you probably will still be nervous until it is officially over.

• To graduate in 5 years, it is better to have 2-3 papers published in legitimate venues with you being the first author and your advisor as the last author.

• Do more than simply stapling your N papers to be your PhD dissertation. The reason is that you have been working in your subfield for so many years and you have many insights that were not part of the individual papers that you have published. You can describe the big picture you have in mind in your dissertation. That will be helpful both to you and to scientists of future generations. If you don’t write your insights down, you’ll most likely forget all of them after you graduate.

• Schedule your defense at least two months beforehand. It’s very difficult to find a time slot that works for everyone, considering that professors are super busy.

## 2022-04-19 (Completed on 2022-04-26) #

1. Roberts, M. E., Stewart, B. M., & Tingley, D. (2019). Stm: An R package for structural topic models. Journal of Statistical Software, 91, 1-40.

This article talks about how the stm package was designed and how to use it (with an example).

1. Esfahani, H., Tavasoli, K., & Jabbarzadeh, A. (2019). Big data and social media: A scientometrics analysis. International Journal of Data and Network Science, 3(3), 145-164.

This is a typical scientometric analysis. The authors analyzed publications on big data and social media. They analyzed authors, citations, and keywords co-occurrence.

1. Eble, A., & Hu, F. (2022). Gendered beliefs about mathematics ability transmit across generations through children’s peers. Nature Human Behaviour, 1-12.

This study shows that the belief that girls are weaker at math than boys is transmitted across generations via children’s peers.

## 2022-04-18 (Completed on 2022-04-25) #

1. Pooley, J., & Katz, E. (2008). Further notes on why American sociology abandoned mass communication research. Journal of Communication, 58(4), 767-786.

The paper, from a historical perspective, talks about why Communication research and Sociology research in the US are separated from each other. The reason is that Sociology researchers who were interested in media research went to Journalism schools, while Sociology researchers who remained in Sociology departments did not continue doing media research.

1. Zhang, R., Ringland, K. E., Paan, M., Mohr, D. C., & Reddy, M. (2021, May). Designing for Emotional Well-being: Integrating Persuasion and Customization into Mental Health Technologies. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-13).

This study finds that for mental health apps, customization functions that are not burdensome might be rewarding. If those functions are burdensome, then people with severe depression and/or anxiety symptoms will find them difficult to use.

## 2022-04-17 (Completed on 2022-04-24) #

This study finds two trends: (1) an increasing number of scientific papers are getting cited at least a few times, and (2) citations are concentrated on only top papers. https://www.pnas.org/doi/10.1073/pnas.2117488119

## 2022-04-16 (Completed on 2022-04-23) #

In this article, the author talked about how Japanese people see themselves. They used to think of themselves as the chosen people. That changed when Japan was defeated in the Second World War and when its economy stopped growing fast in the 1990s.

1. I read some other random articles that interested me in The Economist.

## 2022-04-15 (Completed on 2022-04-22) #

The article argues that doing a PhD is a waste of time. First, there are more PhDs than academia wants. The number of full-time tenure track professor openings is much smaller than the number of newly minted PhDs. Also, PhDs are paid badly. The author argues that PhD students can use their time to do more meaningful things.

I do not totally agree with the point that PhD is a waste of time. At least for me, as a PhD in Computer Science, I do not think time invested in my study is a waste of time. I see a clear relationship between my expertise and the salary I can make after I graduate. For other majors, there is some truth that time invested in PhD study is not worth it.

One point I do agree with is the one hinted at the end of the article: PhD students are good at studying and have been the best at what they have done. However, they know little about the “real world”. This is very inspiring for me. I kind of understand that even if I feel inferior attending classes (because I am not the best in class), I should not look down on myself, because the people around me are not a representative sample of the whole population. Too often, we PhD students drown in study and research and forget that knowledge is only part of our life, not the whole of it.

## 2022-04-14 (Completed on 2022-04-20) #

1. Gleicher, M. (2017). Considerations for visualizing comparison . IEEE transactions on visualization and computer graphics, 24(1), 413-423.

This is a highly influential work.

The author argues that there are four considerations in a comparison task:

1. identify elements to compare
• identify targets (what to compare)
• actions on relationships
2. know what the challenges are in comparison
3. decide on the strategies to be used for comparison.
4. create the design for comparison

There are three reasons why comparisons might become difficult:

• number of items to compare
• complexities in each item
• complexities in the relationships between items

There are three strategies that can be used for a comparison task:

• scan sequentially
• subset
• summarize (two things to consider:)
• how to create the summary
• how to present it

The author mentioned that there are three basic designs for visual comparisons: juxtaposition, superposition, and explicit encoding.

The author also mentioned that summarization can be done in two orders: First find the relationships between objects and summarize these relationships; or first summarize items and compare these summarizations.

What is the relationship between the three basic designs for visual comparisons and the three strategies for a comparison task?

1. Sarikaya, A., Gleicher, M., & Szafir, D. A. (2018, June). Design factors for summary visualization in visual analytics . In Computer Graphics Forum (Vol. 37, No. 3, pp. 145-156).

This is an innovative study. The authors came up with a taxonomy of data summarization and examined 1) whether this taxonomy covers (randomly selected) visualizations in the visualization field, and 2) how the factors in the taxonomy interact. They think that there are four major methods of data summarization: aggregation, subsampling, filtering, and projection. When choosing from these four methods, a designer can consider these factors: purpose, task, and data. Therefore, there are four factors of summarization in visualization: methods, purpose, tasks, and data types.

The authors then did a content analysis on randomly selected publications. They wanted to study whether the taxonomy proposed in this paper is well represented in the randomly selected publications, and in what proportions (for example, how many papers used the aggregation method). They also examined how these four factors interact. For example, in visualizations that used aggregation as the summarization method, what are the tasks and data types?
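To keep the four summarization methods straight, I sketched each one on a toy dataset. This is only my own reading of the terms: the function names and data are made up, and the paper's definitions are richer than this. For example, I treat projection simply as keeping fewer dimensions per item.

```python
import random
from statistics import mean

# Hypothetical records of the form (group, x, y, z); not data from the paper.
points = [
    ("a", 1.0, 2.0, 0.5),
    ("a", 3.0, 4.0, 1.5),
    ("b", 5.0, 6.0, 2.5),
    ("b", 7.0, 8.0, 3.5),
    ("b", 9.0, 1.0, 4.5),
]

def aggregate(rows):
    """Aggregation: replace each group of items with one derived value (mean of x)."""
    groups = {}
    for group, x, _, _ in rows:
        groups.setdefault(group, []).append(x)
    return {group: mean(xs) for group, xs in groups.items()}

def subsample(rows, k, seed=0):
    """Subsampling: keep a random subset of k items to stand in for the whole."""
    return random.Random(seed).sample(rows, k)

def filter_rows(rows, min_x):
    """Filtering: keep only the items that satisfy a condition."""
    return [row for row in rows if row[1] >= min_x]

def project(rows):
    """Projection (assumed here): describe each item with fewer dimensions, x and y only."""
    return [(x, y) for _, x, y, _ in rows]

print(aggregate(points))
print(subsample(points, k=2))
print(filter_rows(points, min_x=5.0))
print(project(points))
```

Aggregation and projection keep every item but in a reduced form, while subsampling and filtering drop items. That distinction helped me see why a designer would choose differently depending on purpose, task, and data type.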

The interactive system affiliated with this paper is super cool: https://graphics.cs.wisc.edu/Vis/vis_summaries/ .

## 2022-04-13 (Completed on 2022-04-19) #

1. Wang, H. (2021). Generational Change in Chinese Journalism: Developing Mannheim’s Theory of Generations for Contemporary Social Conditions. Journal of Communication, 71(1), 104-128.

This study analyzed changes in Chinese journalism by interviewing over 100 journalists.

1. Petr, M., Engels, T. C., Kulczycki, E., Dušková, M., Guns, R., Sieberová, M., & Sivertsen, G. (2021). Journal article publishing in the social sciences and humanities: A comparison of Web of Science coverage for five European countries. PloS one, 16(4), e0249879.

This study looks at where research articles (in social science and humanities) by scholars from the Czech Republic, Slovakia, Poland, Belgium, and Norway, were published. The authors of this article find that an increasing number of those articles are published in journals indexed by the Web of Science, indicating that they are of high quality.

## 2022-04-12 (Completed on 2022-04-18) #

1. Huth, A. G., De Heer, W. A., Griffiths, T. L., Theunissen, F. E., & Gallant, J. L. (2016). Natural speech reveals the semantic maps that tile human cerebral cortex. Nature, 532(7600), 453-458.

This paper is too technical for me, so I do not understand the details. But I got the basic idea: the authors created a semantic atlas that shows which brain areas correspond to which semantic groups. That is to say, using this atlas, we know which brain areas correspond to specific words.

## 2022-04-11 (Completed on 2022-04-18) #

1. Chen, Y. T., Smith, A. D., Reinecke, K., & To, A. (2022). Collecting and Reporting Race and Ethnicity Data in HCI .

The authors analyzed reporting of the race and ethnicity of participants in CHI publications from 2016 to 2021. They found that reporting race and ethnicity is uncommon in CHI publications (they analyzed 3910 papers but only 340 contained race reporting). I am not sure whether they can make this claim because we do not know the baseline of papers that involve participants. For example, what if only 1/5 of all 3910 papers involve human participants? The remaining 4/5 do not need to report race because there are no human participants.

1. Wang, L. L., Stanovsky, G., Weihs, L., & Etzioni, O. (2021). Gender trends in computer science authorship. Communications of the ACM, 64(3), 78-84.

The authors analyzed authors' gender in CS publications. They found that gender parity is not predicted to be reached until 2100. I am not sure whether the results are reliable. I believe Gender API is not able to predict gender for Asian names very accurately.

1. Wang, L. L., Lo, K., Chandrasekhar, Y., Reas, R., Yang, J., Eide, D., … & Kohlmeier, S. (2020). Cord-19: The covid-19 open research dataset . ArXiv.

This paper details how CORD-19 was created, and how CORD-19 has been used in research so far.

## 2022-04-10 (Completed on 2022-04-16) #

1. Zhao, R., & Wang, J. (2011). Visualizing the research on pervasive and ubiquitous computing. Scientometrics, 86(3), 593-612.

The authors analyzed over 5,000 publications accessed from Web of Science on the topic of pervasive or ubiquitous computing. They then conducted a typical scientometric analysis using CiteSpace.

1. Yin, Y., Dong, Y., Wang, K., Wang, D., & Jones, B. (2021). Science as a Public Good: Public Use and Funding of Science (No. w28748). National Bureau of Economic Research.

This is a very interesting study. The authors analyzed the alignment among what scientists use, what the public use, and what is funded. They found that the alignment is greater than previously thought.

1. Pop, M., & Salzberg, S. L. (2015). Use and mis-use of supplementary material in science publications . BMC bioinformatics, 16(1), 1-4.

This is important work. The authors argued that supplementary material is being abused by scientists. We needed supplementary material in the past because of the page limit (since we were printing our papers). Nowadays, however, people mostly read papers in electronic rather than printed form. Therefore, it does not make much sense to still use supplementary material. The problems with supplementary material are that 1) it is not fully reviewed, and therefore its quality is not guaranteed; 2) it is not well integrated into the full text of the paper and is therefore difficult to use; and 3) references in supplementary material are not indexed as citations, so the works referenced go unnoticed, which is unfair to scientists whose works were cited there.

## 2022-04-09 (Completed on 2022-04-14) #

Herbst, S. (2008). Disciplines, intersections, and the future of communication research. Journal of Communication, 58(4), 603-614.

In this article, the author talks about the interdisciplinary-ness of Communication research and also about what communication scholars should do in order to make the field more influential. One suggestion is that Communication scholars need to publish in non-communication journals.

## 2022-04-08 (Completed on 2022-04-12) #

1. Edelmann, A., Wolff, T., Montagne, D., & Bail, C. A. (2020). Computational social science and sociology . Annual Review of Sociology, 46, 61-81.

The authors identified around 400 papers of computational social science and analyzed their core topics.

1. Liu, Y., Goncalves, J., Ferreira, D., Xiao, B., Hosio, S., & Kostakos, V. (2014, April). CHI 1994-2013: Mapping two decades of intellectual progress through co-word analysis . In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 3553-3562).

This is a very innovative study. The authors analyzed the keywords of all CHI papers from 1994 to 2013. They find that HCI is a diverse field where many different topics are related to each other. Based on the results, the authors did not think it is a good idea to break CHI into several sub-conferences partly because having many different topics together can spawn unpredicted new research topics.
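Co-word analysis starts from counting how often two keywords appear on the same paper. Below is a minimal sketch of that counting step only, with made-up keyword lists; it is not the authors' pipeline, and the function name `coword_counts` is my own.

```python
from collections import Counter
from itertools import combinations


def coword_counts(papers):
    """Count how many papers each unordered keyword pair co-occurs on."""
    counts = Counter()
    for keywords in papers:
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical keyword lists, one per paper.
papers = [
    ["visualization", "evaluation", "user study"],
    ["visualization", "user study"],
    ["accessibility", "user study"],
]

counts = coword_counts(papers)
for pair, n in counts.most_common(3):
    print(pair, n)
```

The resulting pair counts can be read as edge weights of a keyword network, which is the kind of structure the paper then maps.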

1. Else, H. (2021). ‘Tortured phrases’ give away fabricated research papers . Nature, 596 (7872), 328-329.

To bypass plagiarism checks, some papers used reverse translation. For example, "big data" becomes “colossal information”. This is a new type of fabricated publication.

## 2022-04-07 (Completed on 2022-04-12) #

1. Padilla, S., Methven, T. S., Corne, D. W., & Chantler, M. J. (2014). Hot topics in CHI: trend maps for visualising research. In CHI'14 extended abstracts on human factors in computing systems (pp. 815-824).

The authors parsed the PDFs of all CHI papers from 2009 to 2013 into raw text and conducted topic modeling. They then categorized these topics into three groups based on their popularity trends: hot, cold, stable. The output is a one-page presentation showing all these topics.
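One simple way to sort topics into hot, cold, and stable is to fit a line to each topic's yearly share and look at the slope. This is only a sketch under my own assumptions: the threshold, the function names, and the yearly shares are hypothetical, and the paper may define the groups differently.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def classify(values, threshold=0.01):
    """Label a topic by its trend; the threshold is an arbitrary assumption."""
    s = slope(values)
    if s > threshold:
        return "hot"
    if s < -threshold:
        return "cold"
    return "stable"


# Hypothetical topic shares for five consecutive years (e.g. 2009 to 2013).
print(classify([0.02, 0.04, 0.06, 0.08, 0.10]))
print(classify([0.10, 0.08, 0.05, 0.03, 0.01]))
print(classify([0.05, 0.05, 0.06, 0.05, 0.05]))
```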

1. Hu, Y., Feng, L., Mutlu, B., & Admoni, H. (2021, June). Exploring the Role of Social Robot Behaviors in a Creative Activity. In Designing Interactive Systems Conference 2021 (pp. 1380-1389).

This study examines how a robot’s behavior affects users' creative activity and how they perceive the robot. The authors find that the playfulness of the robot is important.

## 2022-04-06 (Completed on 2022-04-11) #

1. You, T., Park, J., Lee, J. Y., Yun, J., & Jung, W. S. (2021). Disturbance of greedy publishing to academia . arXiv preprint arXiv:2106.15166.

This is an innovative study. The authors analyzed how predatory journals make their impact factors look better than they really are. They also examined the impact of publications in predatory journals. They found that those questionable publications were not as impactful as normal publications.

1. Beall, J. (2012). Predatory publishers are corrupting open access . Nature, 489(7415), 179-179.

In this article, the author of Beall’s list expressed his concern over predatory publishing.

## 2022-04-05 (Completed on 2022-04-10) #

1. Wadden, D., Lin, S., Lo, K., Wang, L. L., van Zuylen, M., Cohan, A., & Hajishirzi, H. (2020). Fact or fiction: Verifying scientific claims . arXiv preprint arXiv:2004.14974.

This is a cool project. The authors built a model that can automatically check whether a claim in a scientific study is supported by evidence or not.

1. Lo, K., Wang, L. L., Neumann, M., Kinney, R., & Weld, D. S. (2019). S2ORC: The semantic scholar open research corpus. arXiv preprint arXiv:1911.02782.

This is super cool research. The authors built the largest English language academic corpus based on full texts of 8.1 million papers. This is indeed a formidable task but they managed to complete it. The pipeline is also used to build a COVID-19 research corpus (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7251955/).

1. Waisbord, S. (2015). My vision for the Journal of Communication. Journal of Communication, 65(4), 585-588.

Dr. Waisbord served as the Editor-in-Chief of the Journal of Communication. In this article, he shared his vision for the journal. He noted that Communication as a field is fragmented, and he called for internationalization of the journal and the field.

## 2022-04-04 (Completed on 2022-04-09) #

1. Correia, A., Jameel, S., Schneider, D., Fonseca, B., & Paredes, H. (2019, May). The effect of scientific collaboration on CSCW research: A scientometric study . In 2019 IEEE 23rd International Conference on Computer Supported Cooperative Work in Design (CSCWD) (pp. 129-134). IEEE.

This study looks at collaborations in CSCW research from three aspects: local level, national level, and international level.

1. Wang, X., Song, Y., & Su, Y. (2022). Less Fragmented but Highly Centralized: A Bibliometric Analysis of Research in Computational Social Science . Social Science Computer Review, 08944393211058112.

This is exciting work. The authors found more than 7000 papers belonging to the field of computational social science in the Web of Science database and analyzed the co-subject network. They compared this network against four network structures: 1. scale-free, 2. plural-island, 3. small-world, and 4. a random model. They found that this network is centralized but not as fragmented as we thought.

After reading this paper, I have the following follow-ups:

1. How about analyzing the citation network (both references and citations)?
2. How about analyzing the co-author network? Right now, they only studied co-subject networks. But I don’t think this is enough. We need to examine the interdisciplinary-ness from the aspect of co-authors' disciplines.
3. How about analyzing the co-subject network using the tags on Google Scholar? That might be more reflective of the disciplinary makeup of CSS authors.

## 2022-04-03 (Completed on 2022-04-08) #

1. The author found that China (Mainland China + Hong Kong SAR) has already overtaken the US in terms of the total number of publications on artificial intelligence. The quality of its publications is set to overtake that of the US in less than a decade. https://blog.allenai.org/china-to-overtake-us-in-ai-research-8b6b1fe30595. I am very doubtful of this result, though.

2. Dong, Y., Ma, H., Shen, Z., & Wang, K. (2017, August). A century of science: Globalization of scientific collaborations, citations, and innovations. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1437-1446).

The title of this paper explains its content very well. The authors analyzed over 100 million scientific papers published between 1900 and 2015, looking at authors, collaborations, and citations.

1. Sinha, A., Shen, Z., Song, Y., Ma, H., Eide, D., Hsu, B. J., & Wang, K. (2015, May). An overview of microsoft academic service (mas) and applications. In Proceedings of the 24th international conference on world wide web (pp. 243-246).

This paper is about how Microsoft Academic Graph identifies papers, authors, venues, institutions, and fields of study. It also talked about the applications of MAG.

## 2022-04-02 (Completed on 2022-04-07) #

1. Bucchi, M., & Trench, B. (2017). Science communication and science in society: a conceptual review in ten keywords. TECNOSCIENZA: Italian Journal of Science & Technology Studies, 7(2), 151-168.

The authors reviewed around 70 publications in the field of science communication through the lens of ten key words they selected.

1. Arkhipov, D. (1999). Scientometric analysis of Nature, the journal. Scientometrics, 46(1), 51-72.

1. Goerlandt, F., Li, J., & Reniers, G. (2020). The landscape of risk communication research: A scientometric analysis. International journal of environmental research and public health, 17(9), 3255.

This study looks at the field of risk communication and presents a complete scientometric analysis on it.

1. Coursaris, C. K., & Van Osch, W. (2014). A scientometric analysis of social media research (2004–2011). Scientometrics, 101(1), 357-380.

This study analyzed around 600 papers on social media and conducted a scientometric analysis. I was surprised that they chose ProQuest as the platform where they obtained social media research articles and that they only studied 600 papers. I didn’t know how this study could be published.

1. Purnomo, A., Sari, Y. K. P., Firdaus, M., Anam, F., & Royidah, E. (2020, August). Digital literacy research: A scientometric mapping over the past 22 years. In 2020 International Conference on Information Management and Technology (ICIMTech) (pp. 108-113). IEEE.

The authors did a scientometric analysis of research on digital literacy.

## 2022-04-01 (Completed on 2022-04-06) #

1. Wang, L. L., Mack, K., McDonnell, E. J., Jain, D., Findlater, L., & Froehlich, J. E. (2021, May). A bibliometric analysis of citation diversity in accessibility and HCI research. In Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-7).

This study focused on accessibility research (specifically, 836 papers from ASSETS and CHI). It examined fields of study of references (i.e., those referenced in these 836 papers) and citing papers (i.e., those citing these 836 papers).

1. Mack, K., McDonnell, E., Jain, D., Lu Wang, L., E. Froehlich, J., & Findlater, L. (2021, May). What Do We Mean by “Accessibility Research”? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-18).

This is amazing work. The authors analyzed 836 papers on accessibility published in ASSETS and CHI. They analyzed the methods, participants, foci, etc. in these papers.

1. Kaye, J. J. (2009). Some statistical analyses of CHI. In CHI'09 extended abstracts on human factors in computing systems (pp. 2585-2594).

This paper ran a few assorted analyses on CHI authors: specifically, the number of authors per paper, author gender, and repeat authorship.

# 2022-03 #

## 2022-03-31 (Completed on 2022-04-05) #

1. This paper examines possible labor abuse in fishing and illegal, unreported, and unregulated fishing vessels around the world. They found that higher risks of human labor abuse were correlated with poor control of corruption in a country and also with Chinese flagged vessels. https://www.nature.com/articles/s41467-022-28916-2.pdf

2. Plastic pollution is pervasive in the Arctic. https://www.nature.com/articles/s43017-022-00279-8

3. Syropoulos, S., Lifshin, U., Greenberg, J., Horner, D. E., & Leidner, B. (2022). Bigotry and the human–animal divide:(Dis) belief in human evolution and bigoted attitudes across different cultures. Journal of Personality and Social Psychology.

Low belief in human evolution is associated with higher levels of racism and prejudice.

## 2022-03-30 (Completed on 2022-04-04) #

1. Pohl, H., & Mottelson, A. (2019, May). How we guide, write, and cite at Chi. In Extended abstracts of the 2019 CHI conference on human factors in computing systems (pp. 1-11).

This is indeed a very interesting paper. The authors analyzed how CHI papers' readability, titles, novelty, and mentions of famous scholars affected citation counts. One pity is that the authors did not make their data and code available. They would have been enormously useful for future researchers.

1. Heffner, J., & FeldmanHall, O. (2022). A probabilistic map of emotional experiences during competitive social interactions. Nature Communications, 13(1), 1-11.

In this study, the authors did not give an emotion a specific name but rather created a probabilistic map of emotions. They wanted to look at how emotions are linked to the type of social choices. They found that “punitive and uncooperative choices” are linked to sadness and disappointment but very weakly related to anger. This is contradictory to conventional belief.

## 2022-03-29 (Completed on 2022-04-03) #

1. Kaplan, R. D. (2005). How we would fight China. The Atlantic Monthly, 295(5), 49-64.

1. Wang, D., Pedreschi, D., Song, C., Giannotti, F., & Barabasi, A. L. (2011, August). Human mobility, social ties, and link prediction. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1100-1108).

This article finds that the similarity of people’s physical movements is strongly correlated with their proximity in social networks. The authors further showed that mobility patterns can help predict social links.

## 2022-03-28 (Completed on 2022-04-03) #

1. This article talks about the history and the status quo of AI research in China. https://dl.acm.org/doi/10.1145/3239540

## 2022-03-27 (Completed on 2022-04-01) #

Kim, M. C., Zhu, Y., & Chen, C. (2016). How are they different? A quantitative domain comparison of information visualization and data visualization (2000–2014). Scientometrics, 107(1), 123-165.

This study compares information visualization and data visualization from the perspective of keywords and citation trends. They found that the two fields co-evolved while at the same time had different foci.

## 2022-03-26 (Completed on 2022-04-01) #

1. Haroz, S. (2018, October). Open practices in visualization research: Opinion paper. In 2018 IEEE Evaluation and Beyond-Methodological Approaches for Visualization (BELIV) (pp. 46-52). IEEE.

The author checked the open science practice at IEEE VIS 2017. He then proposed recommended methods to practice open science.

1. Wacharamanotham, C., Eisenring, L., Haroz, S., & Echtler, F. (2020, April). Transparency of CHI Research Artifacts: Results of a Self-Reported Survey. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-14).

The authors asked HCI researchers whether, how, and where they shared their data. They recommended sharing data and codes on OSF or Zenodo rather than on GitHub.

## 2022-03-25 (Completed on 2022-03-29) #

Browsed through lots of article titles in Nature but did not go into the details of any.

## 2022-03-24 (Completed on 2022-03-28) #

1. This study looks at the past progress made around the globe in reaching the Sustainable Development Goals (SDGs) and projects future progress. The authors looked at whether the world can reach universal primary, secondary, and tertiary education. They also looked at the gender gap. They found that the gender gap in education had closed globally by 2018, although some countries in Africa and the Middle East still had large gender gaps. They also estimate that by 2030, females will have more education than males in eighteen countries. https://www.nature.com/articles/s41586-020-2198-8

2. This is an interesting study. The authors found that globally, climate has a stronger impact on language diversity than landscape features. https://www.nature.com/articles/s41467-019-09842-2

3. This is also a very interesting study. The authors used OECD’s PISA dataset and found that although females had lower scores than males on math and science (and higher scores on reading tests), they were able to sustain their performance across all three subjects. Therefore, if a math or science test is longer, it will decrease the gender gap in scores. https://www.nature.com/articles/s41467-019-11691-y

## 2022-03-23 (Completed on 2022-03-27) #

1. This study finds that China is well-positioned to reach its emission-reducing goals as declared in Paris in June 2015. https://www.nature.com/articles/s41467-019-09159-0.pdf

2. This study finds that scientific topics associated with scientific prizes experienced increased growth. https://www.nature.com/articles/s41467-021-25712-2.pdf

## 2022-03-22 (Completed on 2022-03-26) #

1. The authors found that co-authoring a paper with a top scientist predicts later success in academia. They were focusing on physics and life sciences. https://www.nature.com/articles/s41467-019-13130-4

2. This is a very interesting study. The authors compared emotions and readability between IPCC reports for policymakers and the scientific and media coverage of the findings of these reports. The authors found that the IPCC reports were low on both metrics. https://www.nature.com/articles/nclimate2824

## 2022-03-21 (Completed on 2022-03-25) #

1. This study used World Bank data on learning in countries across the globe. The data show that progress in learning has been only modest. The authors also show that learning scores are more strongly associated with economic growth than other variables used elsewhere, such as those in the Penn World Tables and the UN’s Human Development Index.

I really like this study not because it has eye-opening results or ground-breaking methods, but because it shows me that using publicly available datasets can also get people published in Nature.

By the way,

The above list is partially based on freeCodeCamp

## 2022-03-20 (Completed on 2022-03-24) #

### Phys.org (2022-03-24) #

• The Hollywood Diversity Report examines the diversity (in terms of race and gender) in Hollywood movie actors, directors, writers, and consumers. They found that people of color are a salient part in all these roles.
• Future telescopes might be able to identify technosignatures directly.
• Nuclear weapons are real and if used, even on a small scale, could end up killing billions of people. https://phys.org/news/2022-03-russia-invading-ukraine-threat-nuclear.html

## 2022-03-19 (Completed on 2022-03-23) #

I browsed through articles on https://www.nature.com/search?article_type=protocols,research,reviews&subject=social-sciences. I only very briefly read through the titles and summaries without going deep into every article. The problem is that I forgot what I read now. So I believe it is a good idea to always summarize what I’ve read rather than jumping to the next article directly.

## 2022-03-18 (Completed on 2022-03-22) #

I read the latest scientific news at https://phys.org/latest-news/. I feel that climate change, water scarcity, energy shortage, poverty, and pollution are among the most urgent issues around the globe.

## 2022-03-17 (Completed on 2022-03-21) #

I read two articles on The Atlantic

1. What Happened to Hong Kong?
2. Only NATO can save Putin

## 2022-03-16 (Completed on 2022-03-20) #

I briefly browsed through articles on pnas.org.

## 2022-03-15 (Completed on 2022-03-19) #

I read daily scientific news on phys.org and some papers in Nature.

## 2022-03-14 (Completed on 2022-03-18) #

Langrock, I., & González-Bailón, S. (2020). The Gender Divide in Wikipedia: Quantifying and Assessing the Impact of Two Feminist Interventions. Available at SSRN 3739176.

This paper looks at the results of two feminist interventions aimed at counteracting the gender divide on Wikipedia. The gender divide on Wikipedia refers to the fact that there is more content about men than about women. This study finds that the two interventions were successful at adding content about women but were not successful at reducing the structural biases that limit the visibility of that content.

## 2022-03-13 (Completed on 2022-03-17) #

The authors used a mathematical model to study people’s attention given to different kinds of digital content. They found that the increasing ups and downs of attention to content are because of increasing creation and consumption of digital content. Now, topics can only receive shorter periods of attention from people. https://doi.org/10.1038/s41467-019-09311-w

## 2022-03-12 (Completed on 2022-03-16) #

1. Choi, J., Jung, S., Park, D. G., Choo, J., & Elmqvist, N. (2019, June). Visualizing for the Non‐Visual: Enabling the Visually Impaired to Use Visualization. In Computer Graphics Forum (Vol. 38, No. 3, pp. 249-260).

This is a cool project. The aim of this paper is to extract information stored in a raster image and reconstruct the information into a data table that is accessible to, for example, blind populations.

1. This is a very cool study. The authors analyzed literature in different parts of the world throughout history from the perspective of romantic love. They found that economic development was strongly correlated with the increase in love fictions. https://www.nature.com/articles/s41562-022-01292-z.pdf

## 2022-03-11 (Completed on 2022-03-15) #

I read all contents on https://www.nature.com/collections/djdhcibbdh, which is basically about scientific work in, and collaborations between, the top five countries measured by NatureIndex, which tracks 82 high-impact natural science journals. It is a good idea to study the outcome of international collaborations and see what benefits they have.

## 2022-03-10 (Completed on 2022-03-14) #

1. Kong, H. K., Liu, Z., & Karahalios, K. (2019, May). Trust and recall of information across varying degrees of title-visualization misalignment. In Proceedings of the 2019 CHI conference on human factors in computing systems (pp. 1-13).

This is a very innovative study.

This study found that 1) people think visualization is impartial even if its title is misaligned with the visualization, and 2) people’s recall of the message is more aligned with the title, not the visualization.

1. This is a very interesting study. The authors used AI to restore damaged ancient Greek inscriptions. https://www.nature.com/articles/s41586-022-04448-z

## 2022-03-09 (Completed on 2022-03-13) #

Krischkowsky, A., Fuchsberger, V., & Tscheligi, M. (2021, May). Making un-use: When humans disengage with technology. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-10).

This paper examines how people disengage with their digital technologies.

## 2022-03-08 (Completed on 2022-03-12) #

I quickly skimmed through a dozen papers. They are not easily understandable to me so I was not able to summarize them here.

## 2022-03-07 (Completed on 2022-03-10) #

1. Lim, B. Y., Dey, A. K., & Avrahami, D. (2009, April). Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 2119-2128).

This is a very innovative study.

This study looks at what explanations an intelligent system should give to its users. The authors compared two kinds of explanations: 1) why the system made a specific decision, and 2) why the system did not make a specific decision. The results show that the first kind of explanation leads to better understanding of and higher trust in the system.

1. Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.

The authors looked at gender and ethnic stereotypes in the US over the past 100 years using word embeddings trained on text from that period, and related the results to US Census data.
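A rough sketch of how embeddings can quantify a stereotype: compare how close a target word sits to two groups of reference words. The two-dimensional vectors below are made up, and the cosine-difference score is a generic association measure I chose for illustration, not necessarily the metric the authors used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise average of a list of vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def association_gap(word_vec, group_a, group_b):
    """Positive if the word is closer to group A, negative if closer to group B."""
    return cosine(word_vec, mean_vector(group_a)) - cosine(word_vec, mean_vector(group_b))


# Toy embeddings (hypothetical values, not from any trained model).
emb = {
    "she": [1.0, 0.1], "her": [0.9, 0.2],
    "he": [0.1, 1.0], "him": [0.2, 0.9],
    "nurse": [0.8, 0.3], "engineer": [0.3, 0.8],
}
female = [emb["she"], emb["her"]]
male = [emb["he"], emb["him"]]

for word in ("nurse", "engineer"):
    print(word, round(association_gap(emb[word], female, male), 3))
```

With real embeddings trained on text from different decades, tracking such a score over time would show how an association shifts.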

## 2022-03-06 (Completed on 2022-03-09) #

This study finds that interaction data, both online and offline, such as phone calls and messaging, can still reveal personal data even after 20 weeks and even if the data is anonymized, if hackers link multiple data together.

## 2022-03-05 (Completed on 2022-03-08) #

1. Trisovic, A., Lau, M. K., Pasquier, T., & Crosas, M. (2022). A large-scale study on research code quality and execution. Scientific Data, 9(1), 1-16.

The authors reran R code from more than 2000 replication datasets deposited in Harvard Dataverse. They found that even after code cleaning, about half of the code still failed to run without errors. The authors also found that ggplot2 is the most popular library, which indicates that data visualization is a common task for researchers.

1. Kaltenegger, L., & Faherty, J. K. (2021). Past, present and future stars that can see Earth as a transiting exoplanet. Nature, 594(7864), 505-507.

This study is eye-opening. It analyzed past, present, and future stars that can see Earth as a transiting planet around the Sun. The authors concluded that 1,715 stars within 100 parsecs of the Sun have been in a position to witness life on Earth since the early stage of human civilization. Around 200 stars will have this opportunity in the next 5000 years.

## 2022-03-04 (Completed on 2022-03-07) #

Brülhart, M., Klotzbücher, V., Lalive, R., & Reich, S. K. (2021). Mental health concerns during the COVID-19 pandemic as revealed by helpline calls. Nature, 600(7887), 121-126.

This is a very cool study. The authors analyzed 8 million helpline calls during and before COVID-19 collected from 19 countries/regions. They found that calls during the pandemic were mostly driven by (1) fear of being infected, (2) being lonely and (3) physical health. The pandemic did not contribute to more calls about relationships, economic issues, violence or suicide attempts. The authors also found that financial support may help relieve psychological stress people face.

## 2022-03-03 (Completed on 2022-03-06) #

1. https://www.pnas.org/doi/10.1073/pnas.2117320119

This paper analyzed how the Black Lives Matter movement shifted public discourse on Twitter, in newspapers, in Google searches, and in Google Books. The authors found that terms associated with the movement became more popular in public discussions.

1. Peng, H., Ke, Q., Budak, C., Romero, D. M., & Ahn, Y. Y. (2021). Neural embeddings of scholarly periodicals reveal complex disciplinary organizations. Science Advances, 7(17), eabb9004.

This study uses network embedding to examine the citation networks of academic journals on Microsoft Academic Graph.

## 2022-03-02 (Completed on 2022-03-05) #

1. As the title indicates, prosociality is a predictor of labor market success around the world. https://www.nature.com/articles/s41467-020-19007-1

2. This is a very interesting study. The authors found that those who sleep less than 7.5 hours per night at home tend to sleep more when traveling, whereas those who sleep more than 7.5 hours per night at home tend to sleep less when traveling. The figures in this paper are impressive. https://www.nature.com/articles/s41562-022-01291-0

## 2022-03-01 (Completed on 2022-03-03) #

1. Humans started processing ochre and using tools 40,000 years ago in China. https://www.nature.com/articles/s41586-022-04445-2

2. If we know more about other people, we feel that they also know about us (even if they don’t at all). This means that if we know more about a stranger, we are less likely to lie or to do inappropriate things. The finding of this paper resonates a lot with me. Oftentimes, I get to know a lot about new friends. Then I assume that they also know a lot about me, which in turn makes me feel that our relationship is closer than it actually is. When I feel we are close, I might invite them to hang out or even borrow money from them. Sometimes I get rejected. Then I get hurt. Now I know that the reason I get hurt might be that I know more about them than they know about me. https://www.nature.com/articles/s41586-022-04452-3

3. This article analyzed and compared 500 editorials published in Science and Nature from 1966 to 2016. The results show that both journals published an increasing number of editorials on climate change, with some peaks (1990; the years up to and including 2009; and 2015) in response to external events. The two journals also differ in their climate change editorials. Specifically, in the beginning Nature focused on government and policy whereas Science focused on energy and technology. But in recent years, Science, more than Nature, has focused on global policy. Journal history, local cultures, and readerships might be the influences that shape how these two journals frame their climate change editorials. https://www.nature.com/articles/s41558-018-0174-1

# 2022-02 #

## 2022-02-27 (Completed on 2022-03-01) #

I read the daily news on https://phys.org/latest-news/. I don’t think it is a good source of academic news for me.

## 2022-02-26 (Completed on 2022-02-28) #

I explored some potential sources to get the latest articles I am interested in:

• https://www.nature.com/search?article_type=protocols,research,reviews&subject=social-sciences
• https://www.journals.elsevier.com/social-science-research/recent-articles
• https://phys.org/latest-news/
• https://www.science.org/news/all-news

## 2022-02-25 (Completed on 2022-02-27) #

1. Margulis, E. H., Wong, P. C., Turnbull, C., Kubit, B. M., & McAuley, J. D. (2022). Narratives imagined in response to instrumental music reveal culture-bounded intersubjectivity. Proceedings of the National Academy of Sciences, 119(4).

This paper finds that the stories conjured up by people after listening to a piece of music are similar if participants are from the same culture, and are different if they are from different cultures.

1. Cecchinato, M. E., Cox, A. L., & Bird, J. (2014, September). " I check my emails on the toilet”: Email Practices and Work-Home Boundary Management. ACM Conference on Human Factors in Computing Systems (CHI).

The authors find that email activities enabled by mobile technologies blurred people’s work and personal boundaries.

1. Lu, Y., & Pan, J. (2021). Capturing clicks: How the Chinese government uses clickbait to compete for visibility. Political Communication, 38(1-2), 23-54.

The study finds that Chinese local governments are using clickbait to generate more reads for their social media posts.

## 2022-02-24 (Completed on 2022-02-26) #

1. Coppersmith, G., Leary, R., Crutchley, P., & Fine, A. (2018). Natural language processing of social media as screening for suicide risk. Biomedical informatics insights, 10, 1178222618792860.

This article detects people’s suicide risks from their social media posts using natural language processing.

1. Anderson, A., Huttenlocher, D., Kleinberg, J., & Leskovec, J. (2014, April). Engaging with massive online courses. In Proceedings of the 23rd international conference on World wide web (pp. 687-698).

This article is based on data from MOOCs at Stanford University. The authors find that 1) students fall into five distinct categories of behavior when taking MOOCs, and 2) a badge system has an effect on students' activity.

## 2022-02-23 (Completed on 2022-02-24) #

1. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

This paper is about a new language representation model called BERT. It’s too technical for me and I could not understand it.

1. Pennington, J., Socher, R., & Manning, C. D. (2014, October). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).

This paper introduces a new model to represent words in vector space. It’s too technical for me and I could not understand it.

## 2022-02-22 (Completed on 2022-02-23) #

1. Liddy, E. D. (2001). Natural language processing .

1. Xu, J. M., Jun, K. S., Zhu, X., & Bellmore, A. (2012, June). Learning from bullying traces in social media . In Proceedings of the 2012 conference of the North American chapter of the association for computational linguistics: Human language technologies (pp. 656-666).

This paper talks about how to detect social bullying utterances on social media using NLP techniques.

## 2022-02-21 (Completed on 2022-02-22) #

1. Ayres-Bennett, Wendy, Marco Hafner, Eliane Dufresne, and Erez Yerushalmi, The economic value to the UK of speaking other languages. Santa Monica, CA: RAND Corporation, 2022. https://www.rand.org/pubs/research_reports/RRA1814-1.html.

This report analyzes why studying other languages can help the UK economy.

1. Mønsted, B., & Lehmann, S. (2022). Characterizing polarization in online vaccine discourse—A large-scale study . PloS one, 17(2), e0263746.

The authors analyzed vaccine sentiment using a sample of 60 billion Tweets. The major findings of this study:

- Content shared by anti-vaccine users is mostly commercial.
- The vaccine debate on Twitter is polarized in a way that users interact mainly with similar users.

## 2022-02-20 (Completed on 2022-02-21) #

1. Goldberg, A. B., & Zhu, X. (2006, June). Seeing stars when there aren’t many stars: Graph-based semi-supervised learning for sentiment categorization. In Proceedings of TextGraphs: The first workshop on graph based methods for natural language processing (pp. 45-52).

I skimmed through this paper. The authors developed a semi-supervised algorithm to infer ratings (for example, ratings of movies or Amazon items) based on very little labeled data. This algorithm performs better than all other algorithms.

## 2022-02-19 (Completed on 2022-02-20) #

1. Schmidt, A., & Wiegand, M. (2019, January). A survey on hate speech detection using natural language processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, April 3, 2017, Valencia, Spain (pp. 1-10). Association for Computational Linguistics.

The authors of this paper give a nice overview of hate speech detection techniques using NLP.

1. Xu, J. M., Huang, H. C., Bellmore, A., & Zhu, X. (2014, May). School bullying in twitter and weibo: a comparative study. In Eighth International AAAI Conference on Weblogs and Social Media.

The authors analyzed Twitter and Weibo posts about school bullying. The results show that there are fewer victim authors on Weibo. Also, Weibo posts mention families more.

## 2022-02-18 (Completed on 2022-02-19) #

Pan, J., & Chen, K. (2018). Concealing corruption: How Chinese officials distort upward reporting of online grievances . American Political Science Review, 112(3), 602-620.

This study analyzed a portion of leaked emails between a monitoring body and upper-level officials in a prefecture in Southern China. The results show that the lower-level monitoring body omits some wrongdoings when reporting public sentiment analysis to upper-level officials. This means that China’s ability to gather information regarding citizens' complaints has systematic shortcomings.

## 2022-02-17 (Completed on 2022-02-19) #

Hirschberg, J., & Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245), 261-266.

This article gives a very nice review of the development of natural language processing (NLP). It talks about the following topics in NLP: machine translation (MT), spoken dialogue systems and conversational agents, machine reading, social media data mining, and sentiment analysis.

## 2022-02-16 (Completed on 2022-02-18) #

1. Bao, L., Krause, N. M., Calice, M. N., Scheufele, D. A., Wirz, C. D., Brossard, D., … & Xenos, M. A. (2022). Whose AI? How different publics think about AI and its social impacts. Computers in Human Behavior, 107182.

I skimmed through this paper. The authors found that US people’s attitudes towards AI can be grouped into five segments: negative, ambivalent, tepid, ambiguous, and indifferent.

1. Kotz, M., Levermann, A., & Wenz, L. (2022). The effect of rainfall changes on economic production. Nature, 601(7892), 223-227.

I skimmed through this study. The authors found that increases in rainfall are associated with decreases in economic growth. They also found that rich countries and the service and manufacturing sectors are affected most by rainfall increases.

1. Anderson, A., Huttenlocher, D., Kleinberg, J., & Leskovec, J. (2013, May). Steering user behavior with badges. In Proceedings of the 22nd international conference on World Wide Web (pp. 95-106).

I skimmed through this paper. The authors studied how badges in websites like Stack Overflow can lead to increases in user participation and user online activities.

## 2022-02-15 (Completed on 2022-02-17) #

Anderson, A., Goel, S., Huber, G., Malhotra, N., & Watts, D. J. (2014). Political ideology and racial preferences in online dating . Sociological Science, 1, 28.

This study examines how political alignment impacts racial preference in online dating by studying more than 250K online dating users in the United States. The authors conclude that conservatives, regardless of their own race and sex, prefer same-race partners more than liberals do. Also, those who do not state this preference act as if they have it. As a result, the gap between conservatives and liberals in terms of preference for same-race partners, while still clear, is smaller.

Fig. 5 seems to indicate that there is no difference between conservatives and liberals in terms of same-race partner preference?

## 2022-02-14 #

Waller, I., & Anderson, A. (2021). Quantifying social organization and political polarization in online platforms . Nature, 600(7888), 264-268.

I skimmed through this paper. The authors used computational methods to study the representation of Reddit communities in terms of Age, Gender, and Partisanship. The analyses were based on over 1.5 billion comments in 10K communities within 14 years of Reddit history. The authors found that Reddit became more “right-wing” after the 2016 US Presidential election; this polarization was more driven by new users. The authors argue that their methods can be applied to the analyses of other online communities as well.

I have a question regarding this paper: How can you assume that all Reddit users are Americans? I personally know a Swedish friend who is using Reddit. If Reddit users are from all over the world, how can you infer the platform’s political partisanship?

## 2022-02-13 #

1. De Choudhury, M., Kiciman, E., Dredze, M., Coppersmith, G., & Kumar, M. (2016, May). Discovering shifts to suicidal ideation from mental health content in social media . In Proceedings of the 2016 CHI conference on human factors in computing systems (pp. 2098-2110).

I skimmed through this paper. The authors identified suicidal ideation from users' Reddit posts.

1. Blitzer, J., Dredze, M., & Pereira, F. (2007, June). Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification . In Proceedings of the 45th annual meeting of the association of computational linguistics (pp. 440-447).

I skimmed through this paper. The authors improved the structural correspondence learning (SCL) algorithm for detecting sentiment in Amazon reviews.

## 2022-02-12 #

1. Templeton, E. M., Chang, L. J., Reynolds, E. A., LeBeaumont, M. D. C., & Wheatley, T. (2022). Fast response times signal social connection in conversation. Proceedings of the National Academy of Sciences, 119(4).

I skimmed through this paper.

The authors found that faster response times in conversations make people feel more connected.

1. De Choudhury, M., Gamon, M., Counts, S., & Horvitz, E. (2013, June). Predicting depression via social media. In Seventh international AAAI conference on weblogs and social media.

I very briefly skimmed through this paper. The authors first identified some Twitter users who were diagnosed with depression and then studied their tweets from the year before their diagnosis. The authors found that social media can help predict depression. For example, depressed people might decrease their social activity and have more tightly clustered ego networks.

1. De Choudhury, M., & De, S. (2014, May). Mental health discourse on reddit: Self-disclosure, social support, and anonymity. In Eighth international AAAI conference on weblogs and social media.

I skimmed through this paper.

The authors studied how Reddit users disclose their mental health issues and how they got social support.

## 2022-02-11 #

1. Castellucci, G. A., Kovach, C. K., Howard, M. A., Greenlee, J. D., & Long, M. A. (2022). A speech planning network for interactive language use. Nature, 1-6.

1. Chi, G., Fang, H., Chatterjee, S., & Blumenstock, J. E. (2022). Microestimates of wealth for all low-and middle-income countries. Proceedings of the National Academy of Sciences, 119(3).

The authors used machine learning to estimate wealth geographically for low- and middle-income countries around the world. These estimates can help the government target poor populations for humanitarian aid.

## 2022-02-10 #

1. Montag, C., Błaszkiewicz, K., Sariyska, R., Lachmann, B., Andone, I., Trendafilov, B., … & Markowetz, A. (2015). Smartphone usage in the 21st century: who is active on WhatsApp? . BMC research notes, 8(1), 1-6.

I skimmed through this paper.

The authors did a survey among around 2500 people (I guess all of them are German). The authors asked the participants to download a custom app that tracks their smartphone use. The study finds that on average people spend around 2.5 hours using smartphones each day. WhatsApp accounts for 1/5 (0.5 h) of total smartphone usage time. Female participants spend more time on WhatsApp each day than male participants.

1. Choi, V. K., Shrestha, S., Pan, X., & Gelfand, M. J. (2022). When danger strikes: A linguistic tool for tracking America’s collective response to threats . Proceedings of the National Academy of Sciences, 119(4).

I skimmed through this study. The authors developed a threat dictionary and validated it using historical data. The authors found that the threat dictionary is sensitive to wars and conflicts, pathogens, and natural disasters. The authors also found that changes in threats are associated with rising conservatism and greater approval of the sitting US president.

## 2022-02-09 #

1. Montag, C., Zhao, Z., Sindermann, C., Xu, L., Fu, M., Li, J., … & Becker, B. (2018). Internet communication disorder and the structure of the human brain: Initial insights on WeChat addiction . Scientific Reports, 8(1), 1-10.

This study finds that excessive use of WeChat is associated with changes in brain structure.

This study uses scales. I am not quite convinced of the results.

1. Király, O., Potenza, M. N., Stein, D. J., King, D. L., Hodgins, D. C., Saunders, J. B., … & Demetrovics, Z. (2020). Preventing problematic internet use during the COVID-19 pandemic: Consensus guidance . Comprehensive psychiatry, 100, 152180.

This article lists some suggestions for people in COVID-19 lockdown on how to prevent problematic use of the Internet, such as gambling, watching porn, excessive use of social media, etc.

I am not sure why this article got more than 400 citations on Google Scholar within 1.5 years of publication.

## 2022-02-08 #

1. Reid, T., & Gilbert, J. (2022). Inclusion in human–machine interactions . Science, 375(6577). DOI: 10.1126/science.abf2618

I really enjoyed reading this piece. It talks about potential problems in human-machine interactions (HMIs):

- When building an airport, developers give more weight to the negative impacts (such as noise) on high-income communities than to those on low-income communities.
- Facial recognition works well for East Asian and White populations, but has problems with Black people.
- It is problematic to deploy new technologies without letting people in the community know. One example is the New York police's use of Digidog.
- Automatic weapons or robots used as weapons can be problematic.

1. Wardle, S. G., Paranjape, S., Taubert, J., & Baker, C. I. (2022). Illusory faces are more likely to be perceived as male than female . Proceedings of the National Academy of Sciences, 119(5).

I skimmed through this paper.

The authors find that people are more likely to perceive illusory faces as male than female. Gender associations with object name and color cannot explain this bias.

## 2022-02-07 #

1. Hessler, A. (2021). A path to independence

The author talks about her journey of becoming an independent researcher.

1. Davies, A., Veličković, P., Buesing, L., Blackwell, S., Zheng, D., Tomašev, N., … & Kohli, P. (2021). Advancing mathematics by guiding human intuition with AI . Nature, 600(7887), 70-74.

I skimmed through this article very briefly. It talks about how machine learning can help mathematicians discover new conjectures and theorems.

1. Chang, K. C., Hobbs, W. R., Roberts, M. E., & Steinert-Threlkeld, Z. C. (2022). COVID-19 increased censorship circumvention and access to sensitive topics in China . Proceedings of the National Academy of Sciences, 119(4).

I skimmed through this paper. The authors found that during COVID-19, an increasing number of people in China circumvented the Great Firewall and got access to Twitter and Wikipedia for information censored in China.

## 2022-02-06 #

1. Jbaily, A., Zhou, X., Liu, J., Lee, T. H., Kamareddine, L., Verguet, S., & Dominici, F. (2022). Air pollution exposure disparities across US population and income groups . Nature, 601(7892), 228-233.

I skimmed through this paper. The authors find that White and Native American populations are exposed to lower levels of PM2.5 than Black, Asian, and Hispanic populations. In terms of income, low-income populations are at higher risk of PM2.5 exposure than high-income populations.

1. O’Meara, S. (2019). China’s ambitious quest to lead the world in AI by 2030

I like the replies written by these scientists. The most inspiring one is that although competition in academia is fierce, it might not be a bad idea to talk with other academics. You never know what they went through to get the achievements you saw.

## 2022-02-05 #

Box-Steffensmeier, J.M., Burgess, J., Corbetta, M. et al. The future of human behaviour research. Nat Hum Behav 6, 15–24 (2022). https://doi.org/10.1038/s41562-021-01275-6

## 2022-02-04 #

1. NASA’s Webb telescope reaches deep space home

I am amazed by the science and technology available today: Scientists are able to operate and communicate with a telescope 1.5 million kilometers away. I am curious about where the telescope gets its energy. From sunlight?

1. Lessons learned from leading NIH

Francis Collins was the director of NIH for 12 years. He talked about achievements made by NIH these years and also lessons he learned over the years.

## 2022-02-03 #

1. Cohen, J. (2022). India’s pandemic toll far exceeds official count. Science.

As the title says, the actual death toll far exceeds the official count by the Indian Government.

1. Hutson, M. (2022). Artificial intelligence unmasks anonymous chess players. Science.

A new study presented at NeurIPS used AI to identify individual chess players based on their playing style. This finding has significant implications for privacy online. For example, given enough data, AI could identify individual drivers and social media users.

1. Cohen, J. (2022). The pandemic whistleblower. Science.

This piece talks about how Dr. Bright revealed the failures of the Trump Administration’s COVID-19 policies.

## 2022-02-02 #

Kloor, K. (2022). Paranormal activity . Science, 375(6579). DOI: 10.1126/science.ada0327

This is a nice feature story. It talks about Avi Loeb, a Harvard professor in astronomy who is now studying UFOs.

## 2022-02-01 #

1. Guglielmi, G. (2022). EU grants restrict U.K. and Swiss research . Science, 375(6578). DOI: 10.1126/science.ada0232

The political tensions between the EU and Switzerland, and those between the EU and the UK, are affecting scientists in the UK and Switzerland. Scientists in these two countries who receive funding from the EU might have to relocate in order to use the money.

1. Gibbons, A. (2022). Early migration may have spread Celtic languages . Science, 374(6575). DOI: 10.1126/science.acz9883

This news piece talks about a study published in Nature. That study finds that around 3000 years ago, there was an influx of people from France to Britain. That was probably when Celtic languages were brought from continental Europe to Britain.

# 2022-01 #

## 2022-01-31 #

1. Servick, K. (2022). Window of opportunity . Science, 375(6578). DOI: 10.1126/science.ada0099

This report talks about a dilemma facing researchers and medical experts. It is very difficult, and unethical, to peek into people’s brains unless there is a medical reason to do so. Therefore, when a patient needs brain surgery, doctors will want to do research at the same time. These are opportunities, but they also raise many ethical concerns. For example, patients might feel pressured to consent even though they are unwilling to participate in research.

1. Nelson, A., & Lubchenco, J. (2022). Strengthening scientific integrity . Science, 375(6578). DOI: 10.1126/science.abo0036

This piece talks about principles of scientific integrity at the White House Office of Science and Technology Policy (OSTP).

## 2022-01-30 #

1. Dobrovidova, O. (2022). Russia begins work on a national permafrost monitoring system . Science, 375(6576). DOI: 10.1126/science.acz9933

Permafrost thaws can lead to disasters, for example, collapses of buildings. Russia has the largest expanse of permafrost in the world, covering 2/3 of the nation. In response to this threat, Russia has started to build a national system to monitor permafrost thaws, which is expected to deliver data in 2023 or later. Some researchers say, however, that the investment in the system is far from enough to generate sufficient data to monitor the thawing.

1. Acquisti, A., Brandimarte, L., & Hancock, J. (2022). How privacy’s past may shape its future . Science, 375(6578). DOI: 10.1126/science.abj0826

This article argues that to protect users' privacy, instead of relying on notice and consent (as GDPR does), we should focus on developing and deploying privacy technologies.

## 2022-01-29 #

This news piece talks about the progress made by COVID-19 Vaccines Global Access (COVAX), an initiative that aims to vaccinate 20% of the population in every country.

This editorial talks about the verdict against Elizabeth Holmes, the former CEO of a start-up, who was found guilty of fraud.

## 2022-01-28 (Completed on 2022-01-29) #

1. Normile, D. (2022). China falls silent about its recruitment efforts .

This news piece talks about China’s Thousand Talents Program (TTP). The US claims that this program is for China to steal innovative ideas from the US. Some scientists say, however, that this program aims to build up academic programs in Chinese universities, and the exchanges between two cultures will be mutually beneficial.

1. Asche et al. (2022). China’s seafood imports—Not for domestic consumption?

The authors show that the majority (75%) of China’s seafood imports are processed and then exported, rather than domestically consumed. The authors argue that processing seafood locally is more desirable for sustainable development.

## 2022-01-27 (Completed on 2022-01-28) #

This story frightens me. I feel very insecure studying in the US after reading the story. I feel so fragile.

This story is about why US federal allegations against Gang Chen, an MIT engineering professor, failed. This is largely because these allegations were not based on facts.

## 2022-01-26 #

Finished Willett et al. (2021)

This paper draws inspiration for future visualization from superpower comics. For example, a future visualization system might let people see through things or count many things instantly.

## 2022-01-25 #

1. Santos, F. P., Lelkes, Y., & Levin, S. A. (2021). Link recommendation algorithms and dynamics of polarization in online social networks . Proceedings of the National Academy of Sciences, 118(50).

I skimmed through this study.

The authors find that recommendation algorithms that recommend similar people contribute to opinion polarization. If the algorithm recommends dissimilar people, polarization might be curbed.

1. Willett, W., Aseniero, B. A., Carpendale, S., Dragicevic, P., Jansen, Y., Oehlberg, L., & Isenberg, P. (2021). Perception! Immersion! Empowerment! Superpowers as Inspiration for Visualization . IEEE Transactions on Visualization and Computer Graphics, 28(1), 22-32.

PP. 1-4

## 2022-01-24 (Completed on 2022-01-25) #

1. Kanngiesser, P., Schäfer, M., Herrmann, E., Zeidler, H., Haun, D., & Tomasello, M. (2022). Children across societies enforce conventional norms but in culturally variable ways . Proceedings of the National Academy of Sciences, 119(1).

I skimmed through this study.

This study looks at whether children would correct other children’s behavior if they have different rules in mind. The results show yes: Children intervene and correct others more frequently when they have followed different rules than when they have the same rules. The magnitude and style of intervention vary across cultures.

## 2022-01-23 #

Liu, J., Tang, T., Wang, W., Xu, B., Kong, X., & Xia, F. (2018). A survey of scholarly data visualization . IEEE Access, 6, 19205-19221.

This paper lists resources on how to visualize scholarly data.

## 2022-01-22 #

Meho, L. I., & Rogers, Y. (2008). Citation counting, citation ranking, and h‐index of human‐computer interaction researchers: a comparison of Scopus and Web of Science . Journal of the American Society for Information Science and Technology, 59(11), 1711-1726.

The authors compared Web of Science with Scopus as a source of citation analysis for the field of HCI. The results show that Scopus has a wider coverage than WoS because the former indexes many conference proceedings that were not part of WoS. The authors conclude that it is inappropriate to use WoS as the sole source for citation analysis for HCI researchers.

This study is very old: it was published 14 years ago. I am not sure whether the results still hold water today.
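To make the point about coverage concrete for myself, here is a minimal sketch of how the h-index is computed and how it can shift when one database indexes more of a researcher's papers. The citation counts below are hypothetical numbers I made up for illustration; they are not taken from the paper.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for the same researcher in two databases.
# The second list is longer because it also covers conference papers.
wos_counts = [25, 12, 7, 3, 1]
scopus_counts = [30, 15, 9, 6, 5, 5, 2]

print(h_index(wos_counts), h_index(scopus_counts))  # prints "3 5"
```

With these made-up numbers, the same person gets an h-index of 3 from the narrower source and 5 from the wider one, which is the kind of gap the authors warn about.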

## 2022-01-21 #

Healy, K., & Schussman, A. (2003). The ecology of open-source software development . Technical report, University of Arizona, USA.

The authors found that the popularity of open-source software follows a power law: a tiny number of projects receive most of the attention (for example, web page visits and downloads).

The authors also found that open source software development is mostly a single person’s performance: the median number of contributors to open source software is 1.

In the Discussion section, the authors hypothesized that the structure of open source software development is very centralized; that is to say, a key person plays a crucial part.

## 2022-01-20 #

Healy, K. (2017). Fuck nuance . Sociological Theory, 35(2), 118-127.

The author argues that it is not beneficial to have too many details in sociological theory. Theory is abstraction and abstraction means it may not apply to individual things.

## 2022-01-19 #

Ahn, Y. Y., Bagrow, J. P., & Lehmann, S. (2010). Link communities reveal multiscale complexity in networks . Nature, 466(7307), 761-764.

This paper is too dense for me and I couldn’t fully understand it. I got the key idea: treating communities as groups of links is better than as groups of nodes. This is because nodes may belong to different groups, which makes it difficult to infer relationships between overlapping groups from the hierarchy of nodes. Treating communities as groups of links solves this problem.
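To check my understanding of the key idea, here is a small sketch of a link similarity measure: two links that share a node are compared by the Jaccard overlap of the neighborhoods of their other two endpoints, where each neighborhood includes the node itself. This is how I read the paper's definition, so treat it as an assumption. The toy graph and the function names are mine, and the clustering step that groups similar links into communities is left out.

```python
def inclusive_neighbors(adj, node):
    """Neighbors of a node plus the node itself."""
    return adj[node] | {node}


def link_similarity(adj, edge_a, edge_b):
    """Jaccard similarity of two links that share exactly one node.

    Links that do not share exactly one node get similarity 0.0.
    """
    shared = set(edge_a) & set(edge_b)
    if len(shared) != 1:
        return 0.0
    (i,) = set(edge_a) - shared  # endpoint unique to edge_a
    (j,) = set(edge_b) - shared  # endpoint unique to edge_b
    n_i = inclusive_neighbors(adj, i)
    n_j = inclusive_neighbors(adj, j)
    return len(n_i & n_j) / len(n_i | n_j)


# Toy graph (made up): two triangles, a-b-c and c-d-e, sharing node c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}

print(link_similarity(adj, ("a", "c"), ("b", "c")))  # same triangle
print(link_similarity(adj, ("a", "c"), ("c", "d")))  # different triangles
```

Node c sits in both triangles, so a node-based partition has to put it in one group or the other. The links do not have that problem: links inside one triangle score high (1.0 here) and links across triangles score low (0.2 here), so each link lands cleanly in one group.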

## 2022-01-18 #

1. Finished Chen et al. (2021)

The authors detailed how they collected all the figures and tables in all the full papers in IEEE VIS of the past 30 years (1990-2019).

1. Liu, Y., Goncalves, J., Ferreira, D., Xiao, B., Hosio, S., & Kostakos, V. (2014, April). CHI 1994-2013: Mapping two decades of intellectual progress through co-word analysis . In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 3553-3562).

The authors analyzed keyword networks in CHI proceedings of 1994-2013. They found that HCI is a diverse field, and some keywords merged. They also found that HCI research dealt with new areas as new technologies emerged.

## 2022-01-17 #

1. Lehmann, S., Jackson, A., & Lautrup, B. (2008). A quantitative analysis of indicators of scientific performance . Scientometrics, 76(2), 369-390.

I skimmed through this paper.

The authors find that we need 50 papers to determine a scientist’s performance.

1. Chen, J., Ling, M., Li, R., Isenberg, P., Isenberg, T., Sedlmair, M., … & Wang, Q. (2021). VIS30K: A collection of figures and tables from IEEE visualization conference publications . IEEE Transactions on Visualization and Computer Graphics.

PP. 1-5

## 2022-01-16 #

Choe, E. K., Lee, N. B., Lee, B., Pratt, W., & Kientz, J. A. (2014, April). Understanding quantified-selfers' practices in collecting and exploring personal data . In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 1143-1152).

The authors did a qualitative analysis of 52 Quantified-Selfers Meetup videos. The average length of the videos is around 15 minutes. The authors analyzed (1) what Q-Selfers track, (2) why they track these things, (3) what tools they use, (4) what mistakes they made when tracking, (5) how they collect, analyze, and visualize their data, (6) what insights they obtained through self-tracking, and (7) what obstacles they encountered in self-tracking. Based on these, the authors proposed some suggestions for HCI researchers to help Q-Selfers, for example, building a tool where people can see the analysis and visualization in real time.

## 2022-01-15 #

Cha, M., Kwak, H., Rodriguez, P., Ahn, Y. Y., & Moon, S. (2009). Analyzing the video popularity characteristics of large-scale user generated content systems . IEEE/ACM Transactions on networking, 17(5), 1357-1370.

This paper analyzed User Generated videos on YouTube, and Daum Videos of South Korea, and non-UGC videos on Netflix, Lovefilm, and Yahoo! Movies. The authors performed various analyses on the features of these videos.

## 2022-01-14 #

👍 Brietzke, S., & Meyer, M. L. (2021). Temporal self-compression: Behavioral and neural evidence that past and future selves are compressed as they move away from the present . PNAS

The authors find that our past and future selves are temporally compressed: our more distant selves are increasingly similar to each other compared to our current and nearby selves. Put another way, the further our selves are from our current self, the less discriminable they become.

The results suggest that representations of our past and future selves might be collectively stored in the same brain regions.

## 2022-01-13 #

1. Nilizadeh, S., Groggel, A., Lista, P., Das, S., Ahn, Y. Y., Kapadia, A., & Rojas, F. (2016, March). Twitter’s glass ceiling: The effect of perceived gender on online visibility . In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 10, No. 1).

I skimmed through this paper.

The authors examined close to 100 thousand Twitter users. They find that being perceived as female based on the displayed first name is beneficial among users who don’t have a lot of visibility. For the highest quantile of visibility, being perceived as female is unfavorable.

Don’t look down on simplicity; good research is often simple.

A good deal of research is spontaneous and social, arising from interactions with your peers or advisor.

Prepare well for every meeting you have with your advisor! Make notes during the meeting! Document the key points after the meeting!

Successful research comes from having a good understanding, especially of the basics.

Instead of focusing on your thesis, “try to do good research and get recognition in the research community.” Thesis is “only a formality for the university and less than you need.”

It’s never too late to change areas. If you don’t like where you are now, switching to a different area or advisor can make a marked difference. Try to explore before settling down.

## 2022-01-12 #

Moss-Racusin, C. A., Dovidio, J. F., Brescoll, V. L., Graham, M. J., & Handelsman, J. (2012). Science faculty’s subtle gender biases favor male students . Proceedings of the National Academy of Sciences, 109(41), 16474-16479.

The authors find that 127 professors in the field of biology, chemistry, and physics in the United States, regardless of their own genders, after reviewing the application material to a lab manager position, rated male applicants as more competent, hireable, and worthy of more mentoring. They also gave male applicants higher salaries than female applicants. The application material is the same; the only difference is the gender of the name associated with the application.

It should be noted that professors responded that they like female applicants more. This may suggest that the bias against women in science is unintentional. Cultural stereotypes might be the reason.

## 2022-01-11 #

1. Finished Huszár et al. (2022)

I skimmed through this paper.

The authors analyzed the Twitter activity of over 3,600 accounts of legislators in seven countries (US, Canada, France, Germany, Japan, Spain, and UK). The main results are:

• Twitter’s recommender system favors the political right wing in all seven countries except for Germany.
• The recommender system does not seem to favor extreme ideologies.

1. Bernstein, M. S., Levi, M., Magnus, D., Rajala, B. A., Satz, D., & Waeiss, C. (2021). Ethics and society review: Ethics reflection as a precondition to research funding . Proceedings of the National Academy of Sciences, 118(52).

I skimmed through this paper.

This paper presents the results of a one-year implementation of the Ethics and Society Review (ESR) at Stanford University. The idea is that the IRB discourages considering a research project’s long-term impact on human society; it focuses only on the impact on human subjects (sometimes animals as well). ESR fills this gap: to get funding, researchers must report their projects' possible risks to society, along with mitigation methods. After one year of implementation, all researchers are willing to continue with it despite the added commitment.

## 2022-01-10 #

1. Cui, J., Wang, C., Zhang, J., & Zheng, Y. (2021). The effectiveness of China’s regional carbon market pilots in reducing firm emissions . Proceedings of the National Academy of Sciences, 118(52).

I skimmed through this paper.

Using data from firms, the authors analyzed the effect of China’s emission trading system (ETS), which aims to decrease China’s greenhouse gas emissions. The authors conclude that:

- Carbon price plays a key role. If China can increase its carbon price, more emission reduction can be achieved.
- If China wants to achieve its 2030 goal (greenhouse gas emissions will peak in that year and then decline), it needs to implement a mass-based rule, which imposes an emission cap, rather than a rate-based rule.

1. Huszár, F., Ktena, S. I., O’Brien, C., Belli, L., Schlaikjer, A., & Hardt, M. (2022). Algorithmic amplification of politics on Twitter . Proceedings of the National Academy of Sciences, 119(1).

PP. 1-2

## 2022-01-09 #

Groh, M., Epstein, Z., Firestone, C., & Picard, R. (2022). Deepfake detection by human crowds, machines, and machine-informed crowds . Proceedings of the National Academy of Sciences, 119(1).

I skimmed through this paper.

The authors compared the accuracy of detecting deepfake videos by humans and by the leading computer vision model. The results show that humans are as accurate as and sometimes more accurate than machines. Humans are better at standard quality videos whereas machines are better at blurry or very dark videos. When the video contains two actors, humans perform better.

The authors did not find that humans' accuracy improved as they watched more videos. Also, anger decreased humans' accuracy at detecting real videos.

## 2022-01-08 #

Finished Miao & Chan. (2021)

Using domestication theory and drawing on interviews with three Chinese gay men from different classes and generations, this paper analyzed how these three people use Blued in their lives.

## 2022-01-07 #

1. Holland, K. J., Hutchison, E. Q., Ahrens, C. E., & Torres, M. G. (2021). Reporting is not supporting: Why mandatory supporting, not mandatory reporting, must guide university sexual misconduct policies . Proceedings of the National Academy of Sciences, 118(52).

This article argues that mandatory reporting of sexual misconduct in universities should be replaced with mandatory supporting. Mandatory reporting is bad because it leaks survivors' private information without their consent, and may also make teaching and research on sexual misconduct difficult, if not impossible.

1. Miao, W., & Chan, L. S. (2021). Domesticating Gay Apps: An Intersectional Analysis of the Use of Blued Among Chinese Gay Men . Journal of Computer-Mediated Communication, 26(1), 38-53.

PP. 1-8

## 2022-01-06 #

Finished Block et al. (2021)

This study looks at how likely the public and political elites are to respond to emails sent under White names and Black names. This study employs a within-subject design; every person receives two emails, one from a White name and the other from a Black name. The results show that both the public and elected officials discriminate against Black senders.

Among the public, 1.6% responded to the White sender. The figure for Black sender is 1.4%. Among the elected officials, 4.2% responded to the White sender, whereas 3.9% responded to the Black sender.

Elected officials discriminate against Black people less than the public, but the difference is not statistically significant.

Fig. 2 shows the breakdown by ethnicity of email recipients. It shows that only Black recipients do not discriminate against Black senders.

## 2022-01-05 #

1. Sèbe, M., & Gourguet, S. (2022). Opinion: To save whales, look to the sky . Proceedings of the National Academy of Sciences, 119(1).

The authors suggest that the International Maritime Organization (IMO) learn from the International Civil Aviation Organization (ICAO) in terms of avoiding collisions with animals. Whale-ship collisions lead to hundreds, and possibly thousands, of whale deaths. ICAO has an extensive database of aircraft-bird collisions (detailed reports for 150,000 incidents). The authors suggest that IMO do the same. The authors also proposed several other measures, which are listed at the end of the article.

To me, it is interesting that the authors compared traffic in the sea and that in the air. They are parallel and comparable.

1. Block, R., Crabtree, C., Holbein, J. B., & Monson, J. Q. (2021). Are Americans less likely to reply to emails from Black people relative to White people? . Proceedings of the National Academy of Sciences, 118(52).

PP. 1-3

## 2022-01-04 #

1. Gross, K., & Bergstrom, C. T. (2021). Why ex post peer review encourages high-risk research while ex ante review discourages it . Proceedings of the National Academy of Sciences, 118(51).

I skimmed through this paper.

The authors used math models and simulations to study whether proposal-based (ex ante) peer review and outcome-based (ex post) peer review lead to different questions that scientists decide to study. The authors find that ex ante peer review makes scientists less likely to pursue risky questions whereas ex post peer review encourages risky questions.

To promote open science, there is a movement in the scientific community to let researchers submit proposals to journals before they start their study. Journals decide whether to accept it based on the proposals, not the results. This practice encourages open science and deters p-hacking. On the other hand, as this paper argues, it might also make researchers less likely to pursue risky projects.

I am wondering whether we can study the differences between the two modes of peer review through experiments.

## 2022-01-03 #

Scheffer, M., van de Leemput, I., Weinans, E., & Bollen, J. (2021). The rise and fall of rationality in language . Proceedings of the National Academy of Sciences, 118(51).

The authors analyzed word frequencies in Google Ngram data covering books from 1850 to 2019. They find that, relative to sentiment words, the frequency of rationality-related words increased from 1850 to the end of the 20th century. After that, the frequency of rationality-related words decreased relative to sentiment words. This pattern is also observed in the New York Times corpus. Since 2004, Google search query results show similar patterns.

## 2022-01-02 (Completed on 2022-01-03) #

1. Shaffer, L. (2021). Inner Workings: Using vaccines to harness the immune system and fight drugs of abuse . Proceedings of the National Academy of Sciences, 118(52).

Researchers are developing vaccines that can target molecules of drugs of abuse before they reach the brain. However, there are difficulties. For example, people have various antibody generation patterns, and vaccines that work for one drug might fail to work for another.

Menstrual irregularity makes detection of pregnancy harder. Using data from a commercial app where people can record menstrual cycles, the authors of this paper find that the following groups have a higher risk of irregular menstrual cycles:

• Those who report polycystic ovary syndrome (PCOS), diabetes, obesity, etc
• Hispanic women

In some states, laws disallow abortion after the detection of a fetal “heartbeat”, which usually occurs 6 weeks after the last menstruation. To be eligible for abortion, people need to detect pregnancy early on. However, the earliest symptom of pregnancy, irregular menstruation, is easily missed, especially by those who have irregular menstrual cycles.

## 2022-01-01 #

Finished Balietti et al. (2021)

The authors matched people based on non-political similarities. A matched group might have similar or different political stances. In each matched group, one person read a short and argumentative essay about wealth redistribution in the US. The authors want to see how similarity in non-political interests (high, and low), political stances (high similarity, and low similarity), and the interactions between the two variables, influenced opinion updates. I feel the study is very complicated.

Major findings:

• The increase caused by reading an essay by a matched person in favor of redistribution is higher than the decrease caused by reading an essay by a matched person against redistribution.

• Polarization for participants with mild views decreased regardless of their matched person’s political stance whereas participants holding strong views became more extreme if they were in the “same stance” group.

• Feelings of closeness are associated with greater belief updates. This effect of closeness is smaller for those with extreme opinions.

• Interaction with people with different opinions decreased closeness; Interaction with people with similar opinions increased closeness. The decrease is greater than the increase.

# 2021-12 #

## 2021-12-31 #

Balietti, S., Getoor, L., Goldstein, D. G., & Watts, D. J. (2021). Reducing opinion polarization: Effects of exposure to similar people with differing political views . Proceedings of the National Academy of Sciences, 118(52).

PP. 1-7

## 2021-12-30 #

Finished The Good Research Code Handbook

• Write small functions.
• Stick to Python coding conventions: use underscores.
• Make your code self-explanatory so it doesn’t even need documentation.
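
As a small illustration of these three tips, here is a sketch in Python. The function names and the sample notes are hypothetical, made up for this example; they are not taken from the handbook.

```python
from statistics import mean


def count_words(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


def average_note_length(notes):
    """Return the mean word count across a list of note strings."""
    return mean(count_words(note) for note in notes)


# Hypothetical reading notes, used only to exercise the functions.
reading_notes = ["I skimmed through this paper.", "Write small functions."]
print(average_note_length(reading_notes))
```

Each function does one thing, the names use underscores, and the names alone say most of what the code does.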

## 2021-12-29 #

Batty, E. (2021). The Good Research Code Handbook

## 2021-12-28 #

1. Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Iii, H. D., & Crawford, K. (2021). Datasheets for datasets . Communications of the ACM, 64(12), 86-92.

The authors suggest that datasets for machine learning have a datasheet which contains:

• motivation
• composition
• collection process
• preprocessing/cleaning/labeling
• uses
• distribution
• maintenance

1. Cao, Y. T., & Daumé III, H. (2019). Toward gender-inclusive coreference resolution . arXiv preprint arXiv:1910.13913.

I skimmed through it. Could not understand it.

## 2021-12-27 #

1. Finished Viswanath et al. (2009)

I skimmed through this paper.

This paper examines the evolution of the activity network on Facebook. The sample came from Facebook users in New Orleans. The researchers find that individual links in the activity network changed rapidly over time. The structure of the network, at least in terms of (1) average user degree, (2) clustering coefficient, and (3) average path length, remained stable.
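
To remind myself what these three structural measures are, here is a rough sketch that computes them on a tiny undirected graph using only the standard library. The graph is hypothetical, and I am assuming the common textbook definitions (for example, a node with fewer than two neighbors gets a clustering coefficient of 0); the paper's exact definitions may differ.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected interaction network: user -> set of neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def average_degree(g):
    """Mean number of neighbors per node."""
    return sum(len(neighbors) for neighbors in g.values()) / len(g)


def local_clustering(g, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = g[node]
    if len(neighbors) < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbors
    pairs = list(combinations(neighbors, 2))
    linked = sum(1 for u, v in pairs if v in g[u])
    return linked / len(pairs)


def average_clustering(g):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(g, node) for node in g) / len(g)


def shortest_path_lengths(g, source):
    """Breadth-first search distances from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in g[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


def average_path_length(g):
    """Mean shortest-path length over all ordered pairs of distinct, connected nodes."""
    total, count = 0, 0
    for source in g:
        for target, distance in shortest_path_lengths(g, source).items():
            if target != source:
                total += distance
                count += 1
    return total / count


print(average_degree(graph), average_clustering(graph), average_path_length(graph))
```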

1. Alkuraya, F. (2021). A genetic revolution in rare-disease medicine Nature News

I skimmed through it.

The author argues that genomics has been improving people’s lives.

I have finished all articles of the “Nature News” type. In the beginning, I thought reading those kinds of papers would give me quicker and easier information about various fields. After all, I can read other people’s recap of a major paper published in Nature. However, after reading more than 30 of these pieces, I feel this is not a good idea. Reading the original paper is better.

## 2021-12-26 #

1. Finished Wickham (2014)

Tidy data: “each variable is a column, each observation is a row, and each type of observational unit is a table”
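
A minimal sketch of what this means in practice, using only the Python standard library. The wide table below (papers per month, by reading mode) is hypothetical data invented for illustration, and the `tidy` helper is my own name, not something from Wickham's paper.

```python
# A "wide" table: one row per month, one column per reading mode.
# The headers "skimmed" and "finished" are values of a variable, not variables.
wide_rows = [
    {"month": "2021-11", "skimmed": 9, "finished": 12},
    {"month": "2021-12", "skimmed": 14, "finished": 10},
]


def tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows into tidy rows: one observation per row."""
    tidy_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            tidy_rows.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tidy_rows


for observation in tidy(wide_rows, "month", "mode", "papers"):
    print(observation)
```

After reshaping, each variable (month, mode, papers) is a column and each month-mode observation is its own row.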

1. Viswanath, B., Mislove, A., Cha, M., & Gummadi, K. P. (2009, August). On the evolution of user interaction in facebook . In Proceedings of the 2nd ACM workshop on Online social networks (pp. 37-42).

PP. 1-3

## 2021-12-25 #

1. Finished Buckee et al. (2021)

This paper is too abstract. I didn’t understand what it was saying. Below are some points I got:

1. Mobile phone data did not provide much disaggregated information. For example, it does not disaggregate gender and occupation. If we consider gender and occupation, the mobility pattern might be very different from that generated from the aggregated data.

2. Instead of having one large model for all disease outbreaks, it is better to have different models for different outbreaks in different contexts.

3. Wickham, H. (2014). Tidy data . Journal of statistical software, 59(1), 1-23.

PP. 1-8

## 2021-12-24 #

1. Finished Wickham (2011)

I skimmed through it. The paper is too technical for me.

1. Continued with Buckee et al. (2021)

PP. 2-6

## 2021-12-23 #

1. Finished Aref et al. (2019)

The authors analyzed the mobility pattern of authors who had main affiliation in at least three countries. The data, which contains 62 million publications, is from Web of Science. More than 90% of authors in these publications did not move internationally. The final dataset is 1.7 million authors who moved globally (as defined above).

These are the main results:

1. The USA and China are the two hubs for highly mobile scholars, followed by England and Germany.

2. China is the top destination for super-movers.

3. The USA is the top destination for early-career super-movers, whereas China is the top destination for intermediate and senior super-movers.

4. Although China is one of the two hubs, it is not an important (ranked only 18th globally) node in the paths of all super-movers. That is to say, China is not a “connector”.

5. Wickham, H. (2011). The split-apply-combine strategy for data analysis . Journal of statistical software, 40(1), 1-29.

PP. 1-5
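
As I understand the strategy named in the title, it means: split the data into groups, apply a function to each group, and combine the results. Here is a minimal sketch in plain Python rather than the R tooling the paper is about. The records and the `split_apply_combine` helper are hypothetical, made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (venue, pages read in one sitting).
records = [("PNAS", 7), ("Nature", 2), ("PNAS", 3), ("Nature", 4), ("Science", 3)]


def split_apply_combine(rows, apply):
    """Group rows by key (split), summarize each group (apply), collect results (combine)."""
    groups = defaultdict(list)
    for key, value in rows:  # split
        groups[key].append(value)
    # apply the summary to each group and combine into one dict
    return {key: apply(values) for key, values in groups.items()}


print(split_apply_combine(records, mean))
```

Swapping `mean` for `max` or `len` changes only the apply step; the split and combine steps stay the same.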

## 2021-12-22 #

1. Jaidka, K., Zhou, A., & Lelkes, Y. (2019). Brevity is the soul of Twitter: The constraint affordance and political discussion . Journal of Communication, 69(4), 345-372.

I skimmed through this paper.

The authors studied close to 360K Twitter replies to US Congressmen and Congresswomen. They compared replies before and after the change to the 280-character limit in terms of linguistic features, such as incivility, politeness, respect, etc. The results show that doubling the number of characters allowed in a tweet made the political discussions less uncivil and more deliberate. This change in character limit, however, decreased empathy and respect.

The data and code for the figures in this paper are here .

1. Aref, S., Zagheni, E., & West, J. (2019, November). The demography of the peripatetic researcher: Evidence on highly mobile scholars from the Web of Science . In International Conference on Social Informatics (pp. 50-65). Springer, Cham.

PP. 1-6

## 2021-12-21 #

Jiang, M. (2014). The business and politics of search engines: A comparative study of Baidu and Google’s search results of Internet events in China . New media & society, 16(2), 212-233.

The author compared the search results of 316 internet events of 2009 on Baidu and Google. She compared the results from the perspectives of accessibility (whether the links can be opened), overlapping, ranking, and bias. She only considered the top ten results for each query.

The results show that (1) after Google moved its server from Mainland China to Hong Kong, its results are as inaccessible as those of Baidu, if not more so, partly due to the Great Firewall and bad links; (2) the results of Google and Baidu are very different, with a very low overlapping rate and different rankings; (3) Baidu’s results are biased towards its own services (i.e., against its competitors like Hudong Baike).

## 2021-12-20 #

Heider, F., & Simmel, M. (1944). An experimental study of apparent behavior . The American journal of psychology, 57(2), 243-259.

The authors presented moving pictures, i.e., a short film to three groups of female undergraduate students. The film consists of movements of three objects: a large triangle, a small triangle, and a small circle. There is also a rectangle that can be opened and closed. This rectangle does not move.

The first group of students simply watched this film and were asked to describe it. The second group watched it and were asked to interpret movements as those of persons. The third group had the same instructions as the second one; the only difference is that they watched the film in reverse.

The results show that whether or not participants were told to interpret the movements as those of persons, almost all of them tended to do so. The authors argue that as soon as we consider moving objects as persons, “perception of motive or need is involved.”

## 2021-12-19 #

1. 👍 Finished Segel & Heer (2010)

I skimmed through this paper.

This study examined 58 examples of narrative visualizations, taken from online journalism, business, and visualization research. Based on these examples, the authors came up with seven genres of narrative visualization: magazine style, annotated chart, partitioned poster, flow chart, comic strip, slide show and film/video/animation. The authors also pointed out that a promising future research direction is to study users' experiences and engagement when interacting with narrative visualizations.

1. Ma, J., Gao, W., Mitra, P., Kwon, S., Jansen, B. J., Wong, K. F., & Cha, M. (2016). Detecting rumors from microblogs with recurrent neural networks .

I skimmed through this paper.

The authors used a recurrent neural network (RNN) model to detect rumors on Twitter and Weibo.

## 2021-12-18 #

1. DeVito, N. J., Richards, G. C., & Inglesby, P. (2020). How we learnt to stop worrying and love web scraping . Nature, 585(7826), 621-623.

The authors briefly introduced how web scrapers work and why web scraping should be embraced as an important tool for scientific research.
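
To make "how web scrapers work" concrete for myself: the core step is parsing a page's HTML and pulling out the pieces you want. Below is a minimal sketch using Python's built-in `html.parser`. The HTML snippet and the `LinkCollector` class are hypothetical; a real scraper would first download the page, which I leave out here.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag encountered while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A hypothetical page; a real scraper would fetch this HTML over the network.
sample_html = """
<ul>
  <li><a href="/trial/001">Trial 001</a></li>
  <li><a href="/trial/002">Trial 002</a></li>
</ul>
"""

collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)
```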

1. Segel, E., & Heer, J. (2010). Narrative visualization: Telling stories with data . IEEE transactions on visualization and computer graphics, 16(6), 1139-1148.

PP. 1-5

## 2021-12-17 #

Yang, T., Ticona, J., & Lelkes, Y. (2021). Policing the Digital Divide: Institutional Gate-keeping & Criminalizing Digital Inclusion . Journal of Communication, 71(4), 572-597.

This is a very cool study. The authors were interested in whether free Wi-Fi offered by restaurants increased the incidents of quality-of-life crime reporting, and how that increase (if there is one) interacts with race and income.

The authors then collected data (restaurants with free Wi-Fi, and crime records) for Chicago. Their analysis shows that free Wi-Fi does not lead to an overall increase in quality-of-life policing. However, it does lead to an increase in affluent areas and White areas, but not in poorer areas and non-White areas.

## 2021-12-16 #

👍 Finished Jung et al. (2021)

This is very interesting and meaningful work! The authors first gathered and summarized major guidelines for alternative text (alt text). The authors pointed out that these guidelines do not provide empirical evidence of their rationale.

The authors then went on collecting visualizations from news websites and surprisingly found that none of them include alt texts. Then they turned to scientific visualizations on publications of IEEE Vis, ACM ASSETS, and ACM CHI.

In the second phase of the study, the authors interviewed 21 blind people and 1 person with low vision. Each participant saw four visualizations without alt texts and was asked to describe what they found. Later, they were given the alt texts.

Based on the interview, the authors found that blind people tend to visualize in their head when they interact with visualizations. Therefore, providing necessary information, such as chart type, color, axes, etc, can relieve their cognitive burden when constructing a mental image of the visualization.

The authors then provided guidelines for alt text generation based on their interview results.

I am wondering how blind people interact with interactive visualizations.

## 2021-12-15 #

Jung, C., Mehta, S., Kulkarni, A., Zhao, Y., & Kim, Y. S. (2021). Communicating Visualizations without Visuals: Investigation of Visualization Alternative Text for People with Visual Impairments . IEEE Transactions on Visualization and Computer Graphics.

PP. 1-7

## 2021-12-14 #

1. Finished Yang (2021)

2. Pennisi, E. (2021). Getting the big picture of biodiversity . Science

This news piece talks about remote sensing. Satellites, planes, drones, etc, can capture sunlight reflected off tree leaves. These data can be used to measure five of the six essential biodiversity variables (EBVs): species by color, tree height, diversity maps, land cover, and usage. Remote sensing has also been used to monitor animals, for example, birds and penguins.

When combined with ground measurements, remote sensing can be more powerful and accurate. For example, a team from Yale University is building computer models based on both remote sensing data and ground measurements, for example, climate and vegetation. These models can be used to predict the location where a given animal species can be found.

## 2021-12-13 #

Yang, G. (2021). Online lockdown diaries as endurance art . Ai & Society, 1-10.

• Online lockdown diaries need endurance
• Wuhan: unique opportunity to record what was going on
• Three types of endurance: (1) living with the unknown, (2) self-doubt, and (3) ephemeral cyberspace, and censorship

PP. 1-7

## 2021-12-12 #

Isenberg, P., Heimerl, F., Koch, S., Isenberg, T., Xu, P., Stolper, C. D., … & Stasko, J. (2016). vispubdata. org: A metadata collection about IEEE visualization (VIS) publications. IEEE transactions on visualization and computer graphics, 23(9), 2199-2206.

The authors collected and cleaned publication data of IEEE Vis papers. They detailed the data collection and cleaning process. The major difficulty is name disambiguation since one author name can have multiple variants, and might also change due to marriage. The authors found that the official IEEE Xplore library missed some information.

The dataset is available at vispubdata.org. The authors also made three visualizations based on this dataset. The visualizations are available at https://www.cc.gatech.edu/gvu/ii/citevis/VIS25/.

## 2021-12-11 #

1. Chang, S., Pierson, E., Koh, P. W., Gerardin, J., Redbird, B., Grusky, D., & Leskovec, J. (2021). Mobility network models of COVID-19 explain inequities and inform reopening . Nature, 589(7840), 82-87.

I skimmed through this paper.

Using mobility data (from SafeGraph) of 90 million people in the USA, the authors predicted infection cases with a simple susceptible-exposed-infectious-removed (SEIR) model. The model was able to predict the evolution of daily confirmed cases in ten large cities of the USA from early March to early May, despite the changing policies and changing human behavior during this period.
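
For my own reference, here is a bare-bones SEIR simulation. It is a single-population, discrete-time sketch, not the paper's model, which is fitted on top of a mobility network. All parameter values below are illustrative assumptions.

```python
def simulate_seir(population, initial_infectious, beta, sigma, gamma, days):
    """Step a deterministic SEIR model forward one day at a time.

    beta: transmission rate, sigma: rate of leaving the exposed state,
    gamma: removal rate. The values used below are illustrative only.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(s, e, i, r)]
    for _ in range(days):
        new_exposed = beta * s * i / population  # susceptible -> exposed
        new_infectious = sigma * e               # exposed -> infectious
        new_removed = gamma * i                  # infectious -> removed
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_removed
        r += new_removed
        history.append((s, e, i, r))
    return history


history = simulate_seir(population=100_000, initial_infectious=10,
                        beta=0.5, sigma=0.2, gamma=0.1, days=120)
peak_day = max(range(len(history)), key=lambda day: history[day][2])
print(f"Infectious count peaks on day {peak_day}")
```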

The model confirmed that people from disadvantaged groups in terms of race and income are more vulnerable to COVID infections. This is because they visited denser locations and stayed longer. This finding has important implications for public policies. As cities are thinking of reopening, governments can implement policies helping reduce infection rates in low-income areas.

1. Buckee, C., Noor, A., & Sattenspiel, L. (2021). Thinking clearly about social aspects of infectious disease transmission . Nature, 595(7866), 205-213.

PP. 1-2

## 2021-12-10 (Completed on 2021-12-11) #

Finished Lee et al. (2021)

I skimmed through this paper very carelessly. It debunked some myths regarding the effects of caloric restriction (CR) diets on prolonging lifespan.

## 2021-12-09 (Completed on 2021-12-11) #

1. Spyrison, N., Lee, B., & Besançon, L. (2021). “Is IEEE VIS* that* good?” On key factors in the initial assessment of manuscript and venue quality .

Among the 46 respondents, the most important factors in deciding whether to read a paper in detail are (1) publication venue prestige, and (2) publication year, i.e., whether it’s recent or not.

The authors argued that we should abandon the idea that the prestige of a publication venue dictates the value of the paper.

1. Lee, M. B., Hill, C. M., Bitto, A., & Kaeberlein, M. (2021). Antiaging diets: Separating fact from fiction . Science, 374(6570), eabe7365.

PP. 1-3

## 2021-12-08 (Completed on 2021-12-11) #

Isenberg, P., Isenberg, T., Sedlmair, M., Chen, J., & Möller, T. (2016). Visualization as seen through its research paper keywords . IEEE Transactions on Visualization and Computer Graphics, 23(1), 771-780.

I skimmed through this paper.

This paper employed co-word analysis to analyze keywords in IEEE Vis papers from 1990 to 2015. In terms of IEEE Vis, there are two types of keywords: those that appear in the actual papers (PDFs), and those that appear on the IEEE Xplore website. The difference is that authors are free to decide their own keywords and put them in their papers, but when they submit their paper on the IEEE Vis platform, they have to choose from keywords already defined by IEEE Vis. The second type is called PCS keywords. Based on these two types of keywords, the authors manually created topics.
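
The basic building block of co-word analysis, as I understand it, is counting how often two keywords appear on the same paper. A minimal sketch follows; the keyword lists are hypothetical and the paper's actual pipeline is more involved.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author keywords for four papers.
paper_keywords = [
    ["graph", "interaction", "evaluation"],
    ["graph", "evaluation"],
    ["volume rendering", "interaction"],
    ["graph", "interaction"],
]


def cooccurrence_counts(keyword_lists):
    """Count how many papers each unordered keyword pair appears in together."""
    counts = Counter()
    for keywords in keyword_lists:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


for pair, count in cooccurrence_counts(paper_keywords).most_common():
    print(pair, count)
```

The resulting pair counts can be read as edge weights in a keyword network, which is the kind of structure that clustering and network analysis are then run on.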

This paper is mainly about two questions: (1) what are the key themes among IEEE Vis papers and what are the relationships among them; (2) how do certain keywords emerge and evolve.

Later, the authors analyzed “topic-coded keywords” and PCS keywords from two perspectives: cluster (how they cluster), and networks (how they relate to each other). The results indicated that topic keywords lack mainstream topics but PCS keywords have mainstream topics.

The authors also mentioned that keywords inputted by paper authors have unnecessary duplicates, for example, singular vs. plural. This is not helpful. The authors also mentioned a fixed list of keywords. But this list is always evolving and it’s a good question to ask whether we can automate the updating process.

keyvis.org is one of the outputs of this paper where people can search keywords appearing in IEEE Vis papers.

## 2021-12-07 (Completed on 2021-12-08) #

👍 Cha, M., Haddadi, H., Benevenuto, F., & Gummadi, K. (2010, May). Measuring user influence in twitter: The million follower fallacy . In Proceedings of the international AAAI conference on web and social media (Vol. 4, No. 1).

I skimmed through this paper.

This study analyzed 6 million Twitter users. The authors distinguished between three different types of influencers in Twitter: indegree, retweet, and mentions. The authors showed that:

1. Indegree influence does not necessarily translate into retweet and mention influence.
2. Most influential people have considerable influences over various topics.
3. People gain influence through concerted efforts, for example, creating content on one single topic, rather than through accidents.

## 2021-12-06 (Completed on 2021-12-07) #

1. Wittenberg, C., Tappin, B. M., Berinsky, A. J., & Rand, D. G. (2021). The (minimal) persuasive advantage of political video over text . Proceedings of the National Academy of Sciences, 118(47).

This study finds that although videos, compared to texts (annotated transcripts of the videos), are more likely to make people believe something actually existed or occurred, they are not more persuasive nor more engaging, at least not in a political context.

This research contains two studies, both employing a within-subject design. Study one contains 48 persuasive messages covering a wide range of topics and Study two has 24 messages about COVID-19.

The authors stated that belief is not equal to persuasion. They said that scholars outside of political science should also pay attention to this point.

1. I am not sure whether a participant only saw one message or all the messages.
2. I am not sure why the authors decided to employ a within-subject design. It seems to me that having watched the video clip will definitely affect the outcome of reading transcriptions.

## 2021-12-05 (Completed on 2021-12-07) #

👍 Tovanich, N., Dragicevic, P., & Isenberg, P. (2021). Gender in 30 Years of IEEE Visualization . IEEE Transactions on Visualization and Computer Graphics.

This paper analyzed all the authors in IEEE Visualization (a conference in the field of Visualization) from a perspective of gender. They examined the overall gender representation, career age, dropout rate, author positioning, collaboration networks, and paper awards.

The key findings can be found at the end of this paper. Here are two findings I especially noted:

1. On average, it takes more years for an author to publish their first paper at IEEE Vis.
2. There is a gender bias in author collaboration in IEEE Vis.

## 2021-12-04 (Completed on 2021-12-05) #

Bartneck, C., & Hu, J. (2009, April). Scientometric analysis of the CHI proceedings . In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 699-708).

Based on all CHI proceedings, including extended abstracts, from 1982 to 2008, this study examines (1) which countries and organizations (i.e., universities, institutes, companies, etc) contributed most to these papers, and (2) whether papers receiving Best Paper Awards received more citations than a randomly selected paper.

## 2021-12-03 (Completed on 2021-12-05) #

Levers, C., Romero-Muñoz, A., Baumann, M., De Marzo, T., Fernández, P. D., Gasparri, N. I., … & Kuemmerle, T. (2021). Agricultural expansion and the ecological marginalization of forest-dependent people . Proceedings of the National Academy of Sciences, 118(44).

I skimmed through this paper.

This study uses high-resolution satellite images to identify forest smallholders and forest areas occupied by them. The authors focus on Gran Chaco, a deforestation hotspot located in South America. The analysis is for the period of 1985-2015. The authors found a general decrease in the forest resources in the surroundings of forest smallholders' homesteads.

## 2021-12-02 #

1. Finished Master et al. (2021)

I skimmed through this paper.

This study combined results from surveys and experiments to demonstrate the (1) existence and the (2) effect of gender stereotypes regarding computer science and engineering abilities among children and adolescents.

Results from the two surveys show that children as young as six and adolescents endorse the stereotype that boys are better at computer science and engineering than girls. The results also show that the more an individual girl endorses this stereotype, the less interested she is in these fields and the lower her sense of belonging in them.

The experiments showed that girls are less interested in an activity labeled with this kind of gender stereotype compared with an activity without such a label.

1. Hannák, A., Wagner, C., Garcia, D., Mislove, A., Strohmaier, M., & Wilson, C. (2017, February). Bias in online freelance marketplaces: Evidence from taskrabbit and fiverr . In Proceedings of the 2017 ACM conference on computer supported cooperative work and social computing (pp. 1914-1933).

I skimmed through this paper.

The study showed that on TaskRabbit and Fiverr, two freelance marketplaces, gender and race significantly correlate with the number and nature of feedback workers received. There also existed a search ranking bias.

This paper only has three levels for the variable of race: White, Asian, Black. What about Latinx? I know it will be very difficult to distinguish between White and Latinx based on workers' profiles, though. I am just wondering whether adding “Latinx” as a level in race would change the results.

## 2021-12-01 #

1. McDermott, A. (2021). News Feature: What was the first “art”? How would we know? . Proceedings of the National Academy of Sciences, 118(44).

The article talks about how to define art and how to know what was the first artistic piece.

1. Master, A., Meltzoff, A. N., & Cheryan, S. (2021). Gender stereotypes about interests start early and cause gender disparities in computer science and engineering . Proceedings of the National Academy of Sciences, 118(48).

PP. 1-4

# 2021-11 #

## 2021-11-30 (Completed on 2021-12-01) #

1. Shah, H. (2020). Global problems need social science . Nature, 577(7789), 295-296.

The challenges we are facing need not only data scientists, but also scientists in the field of social sciences and humanities.

1. Franks, N. P., & Wisden, W. (2021). The inescapable drive to sleep: Overlapping mechanisms of sleep and sedation . Science, 374(6567), 556-559.

We know a lot about why we need food, water and sex but we still do not know why we need sleep. The authors give a nice review of what we know about sleep so far. The authors argue that we sleep because we need an unconscious brain to restore our body.

## 2021-11-29 (Completed on 2021-11-30) #

1. Zhou, C., Sylvia, S., Zhang, L., Luo, R., Yi, H., Liu, C., … & Rozelle, S. (2015). China’s left-behind children: impact of parental migration on health, nutrition, and educational outcomes. Health affairs, 34(11), 1964-1971.

The study found that in rural China, left-behind children do as well as, or slightly better than, children living with both parents, in terms of health, nutrition, and academic performance (Chinese, Math, and English). This result indicates that special programs helping rural children in China should not be restricted to left-behind children only.

The authors estimate that there are over 73 million left-behind children in rural China.

1. Li, W., & Keene, A. C. (2021). Flies sense the world while sleeping . Nature News

This piece reviews a study which finds that when starved, fruit flies will wake up when they sense odors of food.

## 2021-11-28 #

1. Finished Athey et al. (2021)

I skimmed through this paper.

This study compares residential racial isolation with “experienced isolation” based on mobile phone GPS data. The key difference between the measures is that the first one is about location of homes whereas the second one tracks the movement of people. The key findings are (1) experienced isolation is lower than residential isolation, and (2) the two are highly correlated.

👍 2. Fochesato, M., Higham, C., Bogaard, A., & Castillo, C. C. (2021). Changing social inequality from first farmers to early states in Southeast Asia . Proceedings of the National Academy of Sciences, 118(47).

I skimmed through this paper.

The authors studied social inequality measured by Gini coefficients (GCs), which were computed from burials in Northeast Thailand spanning 2500 years. The data show that there were three periods of increased social inequality. The authors argue that in the last of these three periods, there was a transition from dry-rice to wet-rice farming. This transition might have contributed to wealth inequalities and a rise in state societies.
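
To make the Gini coefficient concrete for myself, here is a minimal sketch of the standard rank-weighted formula. The burial "wealth scores" are made-up numbers for illustration only; my notes do not record how the paper actually scored burials or estimated its GCs.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n indexes the values in ascending order.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical wealth scores per burial (not data from the paper).
equal_burials = [5, 5, 5, 5]
unequal_burials = [0, 0, 0, 20]

print(gini(equal_burials))    # everyone has the same wealth
print(gini(unequal_burials))  # one burial holds all the wealth
```

With four burials, the most unequal case gives (n - 1) / n = 0.75 rather than 1, which is a reminder that small samples cap the attainable Gini value.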

## 2021-11-27 #

1. Finished Chu et al. (2021)

The study found that among American Christians who are hesitant to get vaccinated, knowing that medical experts share their religious beliefs increases their intention to get vaccinated and makes them want to encourage others to get vaccinated.

1. Athey, S., Ferguson, B., Gentzkow, M., & Schmidt, T. (2021). Estimating experienced racial segregation in US cities using large-scale GPS data . Proceedings of the National Academy of Sciences, 118(46).

PP. 1-6

## 2021-11-26 #

1. Yu, H., Xue, L., Barrangou, R., Chen, S., & Huang, Y. (2021). Opinion: Toward inclusive global governance of human genome editing . Proceedings of the National Academy of Sciences, 118(47).

This is an opinion piece regarding regulation of human genome editing. The authors argue that the global governance of gene editing should be inclusive: voices of representatives from marginalized countries and regions should be heard and seriously considered.

1. Chu, J., Pink, S. L., & Willer, R. (2021). Religious identity cues increase vaccination intentions and trust in medical experts among American Christians . Proceedings of the National Academy of Sciences, 118(49).

PP. 1-2

## 2021-11-25 #

Hangartner, D., Kopp, D., & Siegenthaler, M. (2021). Monitoring hiring discrimination through online recruitment platforms. Nature, 589(7843), 572-576.

This paper traces the searching and browsing behavior of Swiss recruiters on a public recruitment platform. The analyses based on the rich data show that (1) there is apparent discrimination based on ethnicity in terms of the contact likelihood and time spent on viewing the profile; (2) overall, there is no sign of discrimination based on gender but there is variation across different occupations.

## 2021-11-24 (Completed on 2021-11-25) #

Slonim, N., Bilu, Y., Alzate, C., Bar-Haim, R., Bogin, B., Bonin, F., … & Aharonov, R. (2021). An autonomous debating system. Nature, 591(7850), 379-384.

I skimmed through this paper.

A debating system based on AI was developed to debate with humans. This system breaks the big task, i.e., debating, down into smaller tasks, and then solves them one by one.

## 2021-11-23 (Completed on 2021-11-24) #

1. Ortega, R. P. (2021). Divided we sleep . Science, 374(6567), 552-555.

Non-White people in the US suffer more sleep problems. Causes might include discrimination, night shifts, light pollution, air pollution, and stress.

This is a viewpoint piece. It argues that climate change and the loss of biodiversity are linked to each other and the United Nations is trying to solve the two challenges together.

## 2021-11-22 #

1. Thorp, H. H. (2021). Time to unfriend Facebook? . Science

Thorp argues that scientists should be active on Facebook to compete with key figures in the antiscience world. I am not sure whether this is a good or practical idea. Why? Because using Facebook is very time consuming. I don’t think scientists doing research have the time for that.

1. Normile, D. (2021). A greener path . Science

I don’t question the objectivity of this report on the Belt and Road Initiative (BRI). However, I don’t think it captures the whole picture. Reading this feature article gives me the impression that BRI is doing only bad things. It does acknowledge BRI’s recent move toward going green, but the overall message is that BRI is bad.

I don’t think it’s good to damage the environment. But just as how Putin commented on Greta Thunberg, the famous Swedish environmental activist, the author of this paper might not have thought about how these BRI projects might improve the livelihood of people in impoverished countries.

It is easy to criticize something but hard to give constructive suggestions.

## 2021-11-21 #

1. Jessoe, K., & Moore, F. C. (2021). The cost of changes in energy use in a warming world . Nature News

This is a review piece. It reviews a paper that models the effects of climate change on energy consumption across the globe. The result shows that during the remaining years of the 21st century, with global warming, the rise in electricity use will be offset by the decrease in fuel use. The net result will be a modest decrease in energy consumption. It should be noted that the paper fails to include some important factors in its model. For example, the price of air conditioning or electricity might decrease.

1. Willoughby, P. R. (2021). Early Africans living inland collected unusual objects . Nature, 592(7853), 193-193.

This article reviews a paper that talks about new archaeological findings in South Africa. These findings indicate that people at the site might have already started using ornamental objects.

1. Saçma, M., & Geiger, H. (2021). Exercise generates immune cells in bone . Nature News

This article reviews a paper that shows how exercising boosts the immune system.

## 2021-11-20 #

1. Sadowski, J., Viljoen, S., & Whittaker, M. (2021). Everyone should decide how their digital data are used—Not just tech companies . Nature

This is a comment piece. The authors propose that academics (especially computational social scientists), governments, and the public should take action to make sure large corporations do not hold a monopoly over the data they have. Everyone should have a say over how the data can be used.

1. Ledford, H. (2020). How Facebook, Twitter and other data troves are revolutionizing social science . Nature, 582(7812), 328-331.

This is a feature story. It talks about the differences between traditional social science and the emerging computational social science.

## 2021-11-19 #

1. Rotimi, C. & Adeyemo, A. (2021). Expanding diversity in genomics . Nature News

This is a viewpoint piece. As the title indicates, the authors propose that we need more diversity among the individuals participating in genomics research. This is because knowledge found in one ethnic group may not apply to others.

1. Van Noorden, R. (2020). The ethical questions that haunt facial-recognition research .

This is a news piece. It talks about ethical concerns in facial recognition research in particular and AI in general.

## 2021-11-18 #

1. Lauritzen, L. (2021). A spotlight on seafood for global human nutrition . Nature News

This article reviews a paper that uses modeling to estimate whether boosting the intake of seafood around the world might improve human health. The model indicates that it does. Countries in sub-Saharan Africa and southern Asia will benefit the most. In terms of the make-up of the population, women and children will benefit the most from a rise in seafood intake.

1. Jaeger, K. L. (2021). Most rivers and streams run dry every year . Nature News

This article reviews a study that shows (1) 51%-60% of all the streams in the world run dry for at least one day per year and (2) 44%-53% of worldwide stream length runs dry for at least one month per year.

Overall, the study indicates that non-perennial streams make up a major proportion of rivers and streams in the world and therefore call for more studies.

## 2021-11-17 #

Barrett, L. F. (2021). Debate about universal facial expressions goes big . Nature, 589(7841), 202-203.

This article reviews a study that shows there is universality in human facial expressions. The study comes to this conclusion after using machine learning to analyze 6 million YouTube videos from 144 countries around the world. The article ends by pointing out some methodological flaws in the study: for example, there is no validation that the facial expression annotated by the human rater is in fact what the person in the video is experiencing, and the annotations are made only in English. Nonetheless, this paper opens up new methodological possibilities for similar studies in the future.

## 2021-11-16 (Completed on 2021-11-17) #

1. Preiner, M., & Martin, W. F. (2021). Life in a carbon dioxide world . Nature News.

The article reviews a paper that talks about how the bacterium Hippea maritima uses the reversed oxidative TCA cycle when there is a high concentration of carbon dioxide. (The TCA cycle is a pathway that converts sugars, fats and proteins into energy and carbon dioxide.)

1. Jongman, B. (2021). Fraction of population at risk of floods is growing . Nature News.

This article reviews a paper that uses a new dataset of floods in all continents except Antarctica to show that the percentage of people exposed to floods will continue to grow in the next decade. The estimate of the percentage of people vulnerable to floods based on this new data is ten times higher than the estimate based on previous data and methodology.

1. Yi, R. (2021). Relax to grow more hair . Nature News

This article reviews a study that shows chronic stress in mice reduced their hair growth. Injecting GAS6 into their skin restored the hair growth in mice even when they were still experiencing chronic stress.

## 2021-11-15 (Completed on 2021-11-16) #

👍 Kwak, H., Lee, C., Park, H., & Moon, S. (2010, April). What is Twitter, a social network or a news media? . In Proceedings of the 19th international conference on World wide web (pp. 591-600).

I skimmed through this paper. This study shows that on Twitter (1) follower distribution does not follow a power-law, (2) degree of separation is small, only 4.12, and (3) reciprocity is low. These characteristics deviate from known features of social network sites. This indicates that Twitter is more like a news sharing platform than a social network site.
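
As a note to self on what "reciprocity" means here, below is a minimal sketch using one common edge-based definition: the share of directed follow edges whose reverse edge also exists. The follow list is a toy example, and the paper's exact definition may differ (for instance, counting user pairs instead of edges).

```python
def reciprocity(edges):
    """Share of directed edges (u, v) whose reverse edge (v, u) also exists."""
    # Drop self-loops and duplicate edges before counting.
    edge_set = {(u, v) for u, v in edges if u != v}
    if not edge_set:
        return 0.0
    mutual = sum(1 for u, v in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


# Toy follow graph: only "a" and "b" follow each other; the rest follow "a" one-way.
follows = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "a"), ("e", "a")]
print(reciprocity(follows))  # 2 of the 5 edges are reciprocated
```

A broadcast-style account with many one-way followers pulls this number down, which matches the intuition that Twitter behaves more like a news medium than a friendship network.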

## 2021-11-14 (Completed on 2021-11-16) #

1. Ellison, N. B., Steinfield, C., & Lampe, C. (2007). The benefits of Facebook “friends:” Social capital and college students’ use of online social network sites . Journal of computer‐mediated communication, 12(4), 1143-1168.

I skimmed through this paper. The study finds a positive relationship between Facebook use and the formation and maintenance of social capital.

1. Boyd, D. M., & Ellison, N. B. (2007). Social network sites: Definition, history, and scholarship . Journal of computer‐mediated Communication, 13(1), 210-230.

I skimmed through this paper. It reviews the history of social network sites worldwide and talks about the scholarship in this field.

## 2021-11-13 (Completed on 2021-11-15) #

1. Garry, R. F. (2021). Ebola virus can lie low and reactivate after years in human survivors . Nature News

An Ebola virus infection stayed dormant for five years before reawakening.

1. Mehrabi, Z. (2021). How to buffer against an urban food shortage . Nature News

Adding diversity to city food supply chains can buffer against crises of food shortage (i.e., food shocks).

1. Warren, M. E. (2021). A bridge across the democracy–expertise divide . Nature News

An algorithm was created to randomly select people to form a minipublic to bridge the gap between democracy and expertise.

## 2021-11-12 (Completed on 2021-11-13) #

1. Skimmed through Goel et al. (2016) .

2. Spiers, H. (2021). Brain rhythms on the border . Nature News.

Theta oscillations in our brain’s medial temporal lobe (MTL) increase when we or the people we are watching are nearing boundaries, for example, approaching the edge of a cliff.

## 2021-11-11 #

Goel, S., Anderson, A., Hofman, J., & Watts, D. J. (2016). The structural virality of online diffusion . Management Science, 62(1), 180-196.

PP. 1-7

## 2021-11-10 (Completed on 2021-11-11) #

Juul, J. L., & Ugander, J. (2021). Comparing information diffusion mechanisms by matching on cascade size . Proceedings of the National Academy of Sciences, 118(46).

The study re-examined the results of two landmark studies. The first one studies the diffusion of fact-checked false and true news on Twitter. The second compares the diffusion of news, videos, images, and petitions on Twitter.

The authors of this paper show that when we control for the cascade size (i.e., the number of retweets), the structural and temporal differences in the first study disappear, whereas these differences in the second study remain. This indicates that when the items in diffusion are of the same type, controlling for cascade sizes explains their structural and temporal differences in diffusion. When they are of different types, controlling for cascade sizes does not explain their structural differences in diffusion patterns.
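
The idea of matching on cascade size can be sketched roughly as follows. This is my own simplified, deterministic illustration, not the authors' procedure: each cascade is a hypothetical `(size, depth)` pair, and for every size that appears in both groups I keep the same number of cascades from each group before comparing depths.

```python
from collections import defaultdict
from statistics import mean


def match_on_size(group_a, group_b):
    """Keep equal numbers of cascades per size from both groups.

    Each cascade is a (size, depth) tuple; returns the matched depths.
    """
    by_size_a, by_size_b = defaultdict(list), defaultdict(list)
    for size, depth in group_a:
        by_size_a[size].append(depth)
    for size, depth in group_b:
        by_size_b[size].append(depth)

    matched_a, matched_b = [], []
    # Only sizes present in both groups can be matched.
    for size in by_size_a.keys() & by_size_b.keys():
        k = min(len(by_size_a[size]), len(by_size_b[size]))
        matched_a.extend(by_size_a[size][:k])
        matched_b.extend(by_size_b[size][:k])
    return matched_a, matched_b


# Toy data (made up): false news has one very large, deep cascade.
false_news = [(2, 1), (10, 3), (10, 4), (50, 6)]
true_news = [(2, 1), (2, 1), (10, 3), (10, 4)]

raw_gap = mean(d for _, d in false_news) - mean(d for _, d in true_news)
a, b = match_on_size(false_news, true_news)
matched_gap = mean(a) - mean(b)
print(raw_gap, matched_gap)
```

In this toy example the raw depth gap comes entirely from the groups having different size distributions, so it vanishes once sizes are matched. A real analysis would presumably subsample at random and repeat many times.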

## 2021-11-09 (Completed on 2021-11-10) #

Finished Azuma (1997) .

As a graduate student, you are on the bottom of the academic totem pole. Even undergraduates can rank higher, especially at private universities (because they actually pay tuition!) You cannot order anybody to do anything.

• As a PhD student, you must have initiative. Don’t wait for your advisors to tell you what to do for the next step.

• Respect others' time. Minimize their burden when asking help from them.

• Pay attention to your physical health.

## 2021-11-08 (Completed on 2021-11-10) #

Continued with Azuma (1997) .

PP. 3-9

## 2021-11-07 (Completed on 2021-11-09) #

1. Almeida‐Souza, L., & Baets, J. (2012). PhD survival guide: Some brief advice for PhD students . EMBO reports, 13(3), 189-192.
• Do not assume that your boss knows about what is going on in your projects and in your life. Your boss has his or her own concerns and is supervising a lot of people. When you face problems, just talk to them to find a solution and don’t wait.

1. Azuma, R.T. (1997). A graduate school survival guide: “So long, and thanks for the Ph.D!”

PP. 1-3

## 2021-11-06 (Completed on 2021-11-07) #

1. Olm, M. R., & Sonnenburg, J. L. (2021). Ancient human faeces reveal gut microbes of the past .

Wibowo et al’s study analyzed the DNA of gut microbes found in ancient (1000-2000 yrs old) human faeces. The analysis showed that these ancient gut microbes are similar to those of modern non-industrialized populations.

1. Ma, K. C., & Lipsitch, M. (2021). Big data and simple models used to track the spread of COVID-19 in cities .

Chang et al’s study uses mobility data of 98 million people in the United States to model Covid-19 cases in multiple cities. Key findings:

• Infections in “restaurants, gyms and religious establishments” have a larger role in the pandemic;
• Compared to high-income neighborhoods, low-income neighborhoods had a smaller decline in mobility during lockdowns.
• Places visited by people from low-income neighborhoods are more crowded than those visited by people from high-income neighborhoods.

## 2021-11-05 (Completed on 2021-11-07) #

1. Weng, C. H., & Rogers, J. R. (2021). An AI tool to make clinical trials more inclusive . Nature, 592(7855), 512-513.

Liu et al’s study uses an AI approach to enlarge clinical trial pools.

1. Obermeyer, Z. (2021). A machine-learning algorithm to target COVID testing of travellers .

Bastani et al’s study uses machine learning to help Greek border agents decide which travellers to test for COVID-19, given that it’s impossible to test all of them.

## 2021-11-04 (Completed on 2021-11-07) #

1. Arcaute, E. (2020). Hierarchies defined through human mobility .

Alessandretti et al’s paper reconciles two contradictory findings: humans travel across certain spatial scales vs. human mobility has no scales. The point is that human mobility has hierarchies, represented by different levels of “containers”. When each container is viewed separately, the container size follows a log-normal distribution. However, if we aggregate all containers, the container size follows a power-law distribution.

1. Hoffmann, S. (2021). Lend an ear to a classic tale of mammalian evolution .

Wang et al’s study , which is based on a new fossil (160 million yrs old) discovered in China, updates our understanding of the evolution of the mammalian middle ear.

## 2021-11-03 (Completed on 2021-11-04) #

1. Laschi, C., & Calisti, M. (2021). Soft robot reaches the deepest part of the ocean . Nature News.

Li and colleagues designed a soft robot that can withstand the pressure in the Mariana Trench, the deepest place on Earth.

1. Huentemeyer, P. (2021). Hunting the strongest accelerators in our Galaxy . Nature News.

An observatory in China (LHAASO) reported potential candidates for PeVatrons, the strongest particle accelerators in our Galaxy.

1. Patel, M. S. (2021). Text-message nudges encourage COVID vaccination . Nature News.

Text nudges should (1) make the behavior easier; (2) motivate people, by, for example, invoking ownership of something; and finally (3) make people act right now.

## 2021-11-02 (Completed on 2021-11-03) #

1. Hein, J. (2021). Machine learning made easy for optimal reactions . Nature News
• An open-source machine learning toolkit was built to help chemists optimize reaction conditions. In a competition between experts and the algorithm, the experts made better initial choices, but the algorithm outperformed them after the 3rd trial.
1. Roca, A. (2021). A mammoth step back in genomic time

This news piece introduces a paper that uses gene sequencing to uncover the evolutionary history of mammoths.

1. Rajeswaran, P. & Orsborn, A.L. (2021). Neural interface translates thoughts into type . Nature News.

This news article introduces Willett et al’s work .

## 2021-11-01 (Completed on 2021-11-03) #

1. Nature Podcast of 07 July 2021. Food shocks and how to avoid them
• A higher level of supply chain diversity makes a city in the US less likely to experience food shocks.
1. Normile, D (2021). It’s official: China has eliminated malaria . Science.

# 2021-10 #

## 2021-10-31 #

1. Finished Galesic et al. (2021)

Human social sensing: to ask people about their social environments, for example, the thoughts and behavior of their social contacts.

Human social sensors can help us gain a more accurate understanding of the current and future societal trends.

1. Lisovski, S., & Liedvogel, M. (2021). A bird’s migration decoded . Nature News

## 2021-10-30 #

Galesic, M., Bruine de Bruin, W., Dalege, J., Feld, S. L., Kreuter, F., Olsson, H., … & van Der Does, T. (2021). Human social sensing is an untapped resource for computational social science . Nature, 595(7866), 214-222.

PP. 1-3

## 2021-10-29 (completed on 2021-10-30) #

Alpaslan-Roodenberg, S., Anthony, D., Babiker, H., Bánffy, E., Booth, T., Capone, P., … & Zahir, M. (2021). Ethics of DNA research on human remains: five globally applicable guidelines . Nature, 1-6.

The translations of this paper into more than 20 languages are available here .

• The rule of “consulting indigenous communities”, which mostly applies to the USA, may not work for other countries, for example, those in Central and South America.

• Using ancient DNA research to establish group identity can be potentially extremely harmful.

• Five guidelines

1. Follow regulations in the places where they work and where they obtain human remains. When local regulations are insufficient, researchers should follow the guidelines below.
2. Have a detailed plan before doing any study: research questions, DNA data to be used, techniques to be employed, where to store data, etc.
• responsibility is not transferable.
3. Minimize damage to human remains.
4. Make data public after publication.
5. Respect and be sensitive to other stakeholders' perspectives.
• Genetic data may be inconsistent with other forms of knowledge. When this occurs, researchers should never diminish the importance of traditional knowledge and long-held beliefs.

## 2021-10-28 (Completed on 2021-10-29) #

Finished Wagner et al. (2021)

### Problems #

• Insufficient quality of measurements

• Measurement models: tie theoretical constructs to observable data
• In social sciences, the constructs we are measuring, predicting, and explaining are unstable and they are influenced by the mere act of measuring, predicting, or explaining.
• The consequence of mis-measurements

• There might be a mismatch between the theoretical understanding of a construct and the measurement: what we measure is not what we want to measure
• How to anticipate the side effects of measurements in this “algorithmically-infused society”?
• The limits of existing social theories

• We need theories explaining the role of algorithms in societies.
• In social sciences, should research be theory-driven or data-driven? This is a question.

### Solutions #

• Possible solutions to the above three problems: (1) develop responsible and trustworthy models for measurement; (2) mitigate negative effects of mismeasurements; and (3) develop “empirically informed theories”.

• How to improve measurements

1. Use different data from different sources (e.g., self-report, mobile phone apps, experimental data, etc.)
2. Have good guidelines. For example, to document, develop, and maintain measurement models.
• Measurement itself may affect the outcome. For example, ranking of popularity influences popularity itself.
• Responsible social science agenda

• Are the measurements “just and equitable”, “transparent and interpretable”, and “privacy preserving”?
• Integrating data and measurements into theory-building might be beneficial. This is especially so in algorithmically infused societies, where new phenomena emerge frequently and require theories to be updated.

• We need to avoid “black box measurement models” which rely on unjustifiable assumptions, inaccurate logic and biased data.

• We need to consider the potential consequences of measurements. These consequences might affect individuals and societies, and might also be hard to identify and quantify.

## 2021-10-27 #

Wagner, C., Strohmaier, M., Olteanu, A., Kıcıman, E., Contractor, N., & Eliassi-Rad, T. (2021). Measuring algorithmically infused societies . Nature, 595(7866), 197-204.

PP. 1-4

## 2021-10-26 #

Yuste, R., Goering, S., Bi, G., Carmena, J. M., Carter, A., Fins, J. J., … & Wolpaw, J. (2017). Four ethical priorities for neurotechnologies and AI . Nature News, 551(7679), 159.

I like this piece. I believe some of the concerns brought up in it will be real.

• Privacy. Neural data is so personal. Imagine a hacker hijacking the BCI system you are using, “stealing” all of the thoughts and emotions you had since your childhood. He will know more about you than anybody in the world. I agree with the suggestion that access and sale of personal neural data be strictly regulated, much like sale of human organs.

• Identity. This is a very pressing concern. Imagine your brain is connected with many others. You don’t even know which thought is yours. Then it’s natural you lose your sense of identity. This can be spiritually enlightening or psychologically devastating, depending on your reaction.

• Augmentation. It’s so easy to have an “augmentation arms race”: people, or even countries, augment their abilities through neurotechnologies. This will exacerbate social inequality.

## 2021-10-25 #

Cowen, A. S., Keltner, D., Schroff, F., Jou, B., Adam, H., & Prasad, G. (2021). Sixteen facial expressions occur in similar contexts worldwide . Nature, 589(7841), 251-257.

The study examines the universality of emotions by looking at how people in 6 million videos from 12 world regions encompassing 144 countries express emotions in similar social contexts. The results show that 16 types of facial expressions systematically occur in these videos. The authors show that there is 70% overlap in the association between social contexts and facial expression.

## 2021-10-24 #

Willett, F. R., Avansino, D. T., Hochberg, L. R., Henderson, J. M., & Shenoy, K. V. (2021). High-performance brain-to-text communication via handwriting . Nature, 593(7858), 249-254.

The authors design a brain-computer interface (BCI) that can decode attempted handwriting movements from a paralyzed participant (T5) and translate them into text in real time. The system achieved a typing speed of 90 characters per minute with an accuracy of 94.1% online and more than 99% offline. This outperformed all previous BCI systems, for example, those that decode point-to-point movements.

I am wondering how the system works for decoding attempted handwriting of CJK (Chinese, Japanese, Korean).

## 2021-10-23 #

1. Reed, C. (2021). Argument technology for debating with humans .

A new study has made significant progress in argument mining.

1. Ledford, H. (2020). Social scientists battle bots to glean insights from online . Nature, 578(7793), 17-17.

Social media bots can potentially contaminate social media research. As bot detection advances, bot developers are also becoming more skilled, making more sophisticated bots.

Analyzing personal data without consent poses challenges to society.

Algorithms influence people’s behavior. Therefore, big data not only shows patterns of human behavior, but also algorithms behind the behavior.

## 2021-10-22 #

👍 Wagner, C., Mitter, S., Körner, C., & Strohmaier, M. (2012, April). When Social Bots Attack: Modeling Susceptibility of Users in Online Social Networks . In # MSM (pp. 41-48).

The authors study the characteristics of Twitter users susceptible to social bots based on the data from Social Bot Challenge 2011. They find that susceptible users interact with more users (have a high out-degree), and tend to use Twitter for a conversational purpose rather than for an informational one.

The study is also very limited. For example, all the users had an interest in or had retweeted about cats. In addition, the dataset is too small: only around 400 users (76 susceptible and 298 non-susceptible). But I like this paper: it is short and clear.

## 2021-10-21 #

Finished Thaler (1999) .

• Reference price is important. We will consider a beer from a resort hotel worth more money than that from a grocery store. Sellers make use of this by telling consumers how much money they are saving relative to the regular price.

• If people have two stocks, one of which is increasing in value and the other one decreasing, they tend to sell the winner rather than the loser. This is not rational.

• Paying with credit cards means payment is (1) later than, and (2) separated from the purchase. This makes the payment less salient. Another thing making a payment less salient is that multiple bills arrive together.

• If you want to give someone a gift, you can give them something they wouldn’t buy for themselves.

• People go for variety when choosing multiple things in advance but will choose a limited category of things if asked to choose one at a time.

• If people are given one stock fund and one bond fund, they’ll invest 50% of their money in stocks. However, if given two stock funds and one bond fund, they’ll invest 75% of their money in stocks.

## 2021-10-20 #

Continued with Thaler (1999).

PP. 10-17

## 2021-10-19 #

Continued with Thaler (1999).

PP. 4-10

## 2021-10-18 #

1. Finished Johnson et al. (2020)

This study finds that although anti-vaccination clusters are small in size, they are more entangled with undecided clusters. In comparison, pro-vaccination clusters are more peripheral.

The theory built by the authors predicts that anti-vaccination narratives will dominate the network in the next ten years.

1. Thaler, R. H. (1999). Mental accounting matters . Journal of Behavioral decision making, 12(3), 183-206.

PP. 1-4

## 2021-10-17 (completed on 2021-10-18) #

1. Oh, J., Hwang, A. H. C., & Lim, H. S. (2020). How Interactive Data Visualization and Users’ BMI (Body Mass Index) Influence Obesity Prevention Intentions: The Mediating Effect of Cognitive Absorption . Health Communication, 1-10.

The study finds that (1) people with a lower BMI are more involved in highly interactive data visualization, and (2) this greater level of involvement makes lower BMI people (but not high BMI folks) think this (i.e., obesity) is a serious issue.

1. Johnson, N. F., Velásquez, N., Restrepo, N. J., Leahy, R., Gabriel, N., El Oud, S., … & Lupu, Y. (2020). The online competition between pro-and anti-vaccination views . Nature, 582(7811), 230-233.

PP. 1-2

## 2021-10-16 #

1. Strohmayer, A., Clamen, J., & Laing, M. (2019, May). Technologies for social justice: Lessons from sex workers on the front lines . In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-14).

I skimmed through this paper.

The paper talks about a sex worker rights organization in Canada which maintains a bad client list. One interesting point the authors point out is that technologies are not the solution to complex social justice problems; they only support, or aid, the efforts.

1. Binns, R., Van Kleek, M., Veale, M., Lyngs, U., Zhao, J., & Shadbolt, N. (2018, April). ‘It’s Reducing a Human Being to a Percentage’ Perceptions of Justice in Algorithmic Decisions . In Proceedings of the 2018 Chi conference on human factors in computing systems (pp. 1-14).

I skimmed through this paper.

This paper is about how algorithmic decisions affect people’s perceptions of justice. The study finds that people consider justice-related issues when they encounter decisions made by algorithms. The study also finds that explanations may or may not affect how people feel about the fairness of the decisions, depending on the scenarios. The authors conclude that there might be no best approach to explaining algorithmic decisions.

## 2021-10-15 (completed on 2021-10-16) #

Strohmayer, A., Laing, M., & Comber, R. (2017, May). Technologies and social justice outcomes in sex work charities: Fighting stigma, saving lives . In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (pp. 3352-3364).

I skimmed through this paper.

This paper talks about NUM, a charity organization that helps sex workers in the UK. Sex workers join NUM to report and receive alerts of bad clients. The authors argue that HCI can help sex workers.

## 2021-10-14 (completed on 2021-10-15) #

1. Guo, P. (2012). The Ph.D. Grind: Lead From Below

As a subordinate, you need to know how to get support from your boss even when you are implementing your ideas, not theirs. It’s hard.

1. Kurenkov, A. (2020). Lessons Learned the Hard Way in Grad School (so far)
• Teamwork is useful and helpful. It may relieve some of your stress as working alone might not produce papers.

• You need to maintain your health.

## 2021-10-13 #

Finished Potvin & Levenberg (2016)

Google and its thousands of developers around the world have been using one single monolithic repository which contains one billion files and 35 million commits. Google uses tools like Piper and CitC (Clients in the Cloud) to support using one monolithic repository.

Using a monolithic repository has benefits such as encouraging collaboration and easier dependency management. The drawbacks are that it requires investment to support such a huge repository and the codebase might become too complicated. However, the benefits outweigh the drawbacks and that’s why Google is not interested in splitting the repository.

## 2021-10-12 #

I skimmed through this thesis.

Abhraneel’s study tries to compare insights (operationalized as recall and comprehension) users obtain from visualizations with narratives and with interactivity (there are four conditions, see Fig. 1 on p.9). The results show narratives have weak effects and interactivity has barely any effects. The conclusion is that interactivity may not be a necessary component of information visualizations. This, however, by no means means that interactivity is always useless.

1. Potvin, R., & Levenberg, J. (2016). Why Google stores billions of lines of code in a single repository . Communications of the ACM, 59(7), 78-87.

PP. 1-5

## 2021-10-11 #

1. Haeussler, C., & Sauermann, H. (2020). Division of labor in collaborative knowledge production: The role of team size and interdisciplinarity. Research Policy, 49(6), 103987.

I skimmed through this paper. Didn’t understand it. Generally, larger/interdisciplinary teams have greater division of labor.

## 2021-10-10 #

Linxen, S., Sturm, C., Brühlmann, F., Cassau, V., Opwis, K., & Reinecke, K. (2021, May). How WEIRD is CHI? . In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-14).

I skimmed through this paper.

• The study finds that 73% of participants in CHI publications from 2016 to 2020 are from “WEIRD” (Western, Educated, Industrialized, Rich, and Democratic) countries where 97% come from countries that satisfy all the five “WEIRD” variables.

• Of 2,7688 papers, 2,611 had human participants. Of these 2,611, only 38.9% reported country origin of participants or allowed guessing based on the authors' country origin.

• Around 70% of all the participants are university students or have a university degree.

• Over 80% of authors recruit participants “in their own backyards.”

## 2021-10-09 #

Finished Tannenbaum et al. (2019)

The authors argue that sex and gender analysis is important for research of different fields, and propose that funding agencies, academic journals, and universities support incorporating sex and gender analysis into scientific research.

## 2021-10-08 #

Tannenbaum, C., Ellis, R. P., Eyssel, F., Zou, J., & Schiebinger, L. (2019). Sex and gender analysis improves science and engineering . Nature, 575(7781), 137-146.

PP. 1-5

## 2021-10-07 (completed on 2021-10-08) #

Chu, J. S., & Evans, J. A. (2021). Slowed canonical progress in large fields of science . Proceedings of the National Academy of Sciences, 118(41).

The paper first theorizes that as scientific fields get large, the top-cited papers only become more impactful, making it hard for new ideas to be found and cited. The authors then use data from Web of Science to test six predictions based on their theory. The data support the predictions.

## 2021-10-06 #

Yin, Y., Wang, Y., Evans, J. A., & Wang, D. (2019). Quantifying the dynamics of failure across science, startups and security . Nature, 575(7781), 190-194.

The study uses three data sets, i.e., NIH grant applications, startups, and terrorist attacks, to test a simple one-parameter model that predicts failure & success. The model works pretty well. Key findings:

• Success is not due to chance.
• People who succeed do not necessarily try more times than their failed counterparts.
• Among successful people, there is significant improvement in the second attempt compared to the first one. This improvement is absent for the failed group.
• It is unnecessary to learn from all past failures.

## 2021-10-05 #

I skimmed through the whole book. Some suggestions I bore in mind:

• Don’t let graduate school kill your personal time. You need an adequate amount of sleep. You need to eat and drink well. You need to have a personal life.
• Especially in the first year, if you want to collaborate with two different professors, make sure to let them know. Otherwise, one of them will be surprised if you decline the opportunity to work with him/her. (p73)
• Set aside time each day, at least each week, for writing. Do it even though you don’t feel like doing it that day/week.
• Write one co-authored paper and one solo-authored paper each year in the first two years of your PhD.
• Mentorship is better than apprenticeship. The first means you work with more than one advisor to develop your own research agenda whereas the second means you work for only one advisor on his/her research project.
• Practice your research talk! You definitely need practice.
• R is good!

I disagree with Tom’s suggestions about websites (p.70). Tom says it is not very useful if you have a professional website at the start of your PhD as you don’t have much stuff to put on it. Rather, he suggests a website at the end of PhD when you are looking for a job.

I highly disagree with it. In my opinion, a homepage is much more than your CV. Yes, you might not have papers to put on your homepage, but there are millions of other things you can put. Write blogs! Document your daily tangible progress! Publish your projects! Introduce yourself! These activities will (1) let you know what you lack, and (2) motivate you to make progress each and every day.

## 2021-10-04 #

Finished Chen et al. (2021)

The authors initiated a few neutral bots on Twitter, called “drifters”. These drifters are the same except for the political alignment of the initial account they follow. The key findings of this paper:

1. The information drifters receive is largely dependent on the initial accounts they follow.
2. Most drifters find themselves in echo chambers. Echo chambers of conservative accounts are especially dense.
3. Right-leaning drifters encounter more low-credibility information.
4. The type of content a drifter receives is not manipulated by the platform, i.e., Twitter. Rather, it is largely dependent on the political alignment of their friends.

## 2021-10-03 #

1. 👍 Vespignani, A. (2018). Twenty years of network science . Nature.

The small-world and preferential-attachment models still underpin our understanding of networks.

This piece is a super good explanation of the small-world paper by Watts and Strogatz (1998).

1. Chen, W., Pacheco, D., Yang, K. C., & Menczer, F. (2021). Neutral bots probe political bias on social media . Nature Communications, 12(1), 1-10.

PP. 1-3.

## 2021-10-02 #

1. A Letter to Research Students by Duane A. Bailey.
• Keep reading papers, and summarizing them. The summary will be very useful when (1) you introduce a paper to others, and (2) you reread it.

• Write well. Practice. Set time aside.

• Collaborate. Work with other people.

• “Rehearse and time your talks.”

• The people you will be working with are the most important factor when you decide which program to attend.

1. How to evaluate an advisor by Jason Eisner.

## 2021-10-01 (completed on 2021-10-02) #

McAfee, A., Brynjolfsson, E., Davenport, T. H., Patil, D. J., & Barton, D. (2012). Big data: the management revolution . Harvard business review, 90(10), 60-68.

Each of us is now a walking data generator.

Computers are useless. They can only give you answers. – Pablo Picasso

# 2021-09 #

## 2021-09-30 (completed on 2021-10-02) #

👍 1. Finished West et al. (2021)

This paper analyzed attention cycles for deceased public figures based on data collected from (1) news on the web and (2) Twitter mentions. The authors find that short-term attention boosts last for around 30 days after death. On average, long-term attention boosts are largest for artists and smallest for leaders. It might be because artists have legacies that people can enjoy even after they die.

1. McAfee, A. (2011). What every CEO needs to know about the cloud . Harvard business review, 89(11), 124-132.

I didn’t know that Box has been around for a decade.

This article is almost ten years old. Looking back, I don’t feel that cloud computing has revolutionized the whole industry or people’s lives. Maybe it’s only because I didn’t notice it?

## 2021-09-29 #

West, R., Leskovec, J., & Potts, C. (2021). Postmortem memory of public figures in news and social media . Proceedings of the National Academy of Sciences, 118(38).

PP. 1-5

## 2021-09-28 #

1. Ivancevic, A. (2018). Done is better than perfect: overcoming PhD perfectionism
• If you are too perfectionist, you’ll probably never publish anything.
• Don’t expect your PhD thesis to be perfect. If you do, you’ll never be able to graduate.
• If you don’t understand someone else’s work, then it’s their fault.
• You have to figure out on your own why your work is important.
• Never expect your thesis to be perfect. There is no perfect thesis.
• Be open to job opportunities outside of your PhD. This prevents you from losing your independence. If you know you have other options outside of your research, you are better off.
• Avoid taking lectures.
• “To learn to think, you need two things: large blocks of time, and as much one-on-one interaction as you can get with someone who thinks more clearly than you do.”
• “Write a proposal and get it criticized.”
• Start publishing papers early on.
• Publish regularly but not too often. Uncited papers are a waste of time and energy.

## 2021-09-27 (completed on 2021-09-28) #

1. Cirik, V. (2019). PhD 101
• Research involves lots of uncertainty and failures. Be prepared.
• Don’t equate your self-worth to research ideas you have.
• The narratives you see in papers make it sound like ideas flow smoothly. In reality, that never happens. There will be many trials and errors and many “dead ends”.
• Success is not equal to great personality. Don’t be surprised if you learn of terrible things happening in your research community.
• PhD should definitely not become the whole of your life! You should have a life outside of your PhD!
1. Taylor, L. (2018). Twenty things I wish I’d known when I started my PhD
• (About work-life balance & mental health)

• Maintain a work-life balance by keeping daily routines that work for you. Taking good care of yourself is key to your success.

• Have a life outside your work. PhD should not become your whole life.

• Don’t compare yourself with others.

• Attend seminars and lab-group meetings. Things you’ll hear might change your research trajectory completely.

• Back up. Back up. Back up.

• Write down everything. Don’t count on your memory.

• You need completed work, not perfect work.

• Discuss expectations with your supervisor.

• Be honest with your supervisor.

• Don’t struggle alone. Ask for help if you need it.

## 2021-09-26 (completed on 2021-09-28) #

1. Kreiner, J. (2021). How to reduce digital distractions: advice from medieval monks

1. Fiesler, C. (2019). Why (and how) academics should blog their papers

1. Fiesler, C. (2018). What Our Tech Ethics Crisis Says About the State of Computer Science Education

Ethics should not be separated from software engineering. You write code, and then you are responsible for your code.

Even scientists might consider negative impacts their studies might have.

## 2021-09-25 #

👍 Bromham, L., Dinnage, R., & Hua, X. (2016). Interdisciplinary research has consistently lower funding success . Nature, 534(7609), 684-687.

The authors examined the relationship between (1) the degree of interdisciplinarity of research proposals and (2) the rate at which these proposals are funded.

The dataset consists of all proposals, both successful and unsuccessful, submitted to the Australian Research Council Discovery Programme from 2010 to 2014.

Success rate is easy to calculate but “interdisciplinarity” of a proposal is not. Researchers used to measure “interdisciplinarity” by checking specific words like “interdisciplinary” or through bibliometric analysis. These two measures have many disadvantages. The authors here propose a new measure called interdisciplinary distance (IDD).

The results show that interdisciplinary proposals have lower success rates and this is not because universities with higher success rates submit more narrowly focused research proposals.

## 2021-09-24 (completed on 2021-09-25) #

• Treat a PhD like a job. You are not a student anymore. Don’t wait for others to tell you what to do.

• Apply for as many grants as your time allows you to do so.

1. Casey Fiesler (2019). Advice for New PhD Students: Your Research Career is a Long Game
• Read papers as if you’ll be citing them over the next 20 years instead of right now.

• Don’t write papers just to be accepted by a conference. Write each one as if it will be the start of a whole new research path.

• Don’t compare yourself with others.

• If you don’t have any free time at all during your PhD, something must be wrong.

1. Li, F-F. (2009). De-Mystifying Good Research and Good Papers

Not enough people conduct first class research. And not enough people write good papers.

[Y]our research topic should have many ‘customers’, and your solution would be the one they want to use.

• You should aim to let your paper inspire many other papers in the future to follow and cite you.

## 2021-09-23 (completed on 2021-09-24) #

👍 Im, J., Chandrasekharan, E., Sargent, J., Lighthammer, P., Denby, T., Bhargava, A., … & Gilbert, E. (2020, July). Still out there: Modeling and identifying russian troll accounts on twitter . In 12th ACM Conference on Web Science (pp. 1-10).

The authors used a machine learning model (logistic regression) to identify Russian Troll Twitter accounts based on the dataset released by Twitter of potential Troll accounts. This model is able to distinguish troll accounts from randomly selected Twitter accounts with good precision. Adjusted by human verification, the authors argue that around 2.6% of examined accounts on Twitter are Russian Troll accounts.

The authors also find that those troll accounts do not behave like bots.

## 2021-09-22 #

Finished Advice for early-stage Ph.D. students by Philip Guo. Find my summary of it here .

## 2021-09-21 #

Advice for early-stage Ph.D. students by Philip Guo.

PP. 16-27

## 2021-09-20 (completed on 2021-09-21) #

Advice for early-stage Ph.D. students by Philip Guo.

PP. 8-15

## 2021-09-19 (completed on 2021-09-20) #

Advice for early-stage Ph.D. students by Philip Guo.

PP. 1-7

## 2021-09-18 #

👍 Ruder, S. (2020). 10 Tips for Research and a PhD

• Look beyond your immediate interest. Papers that connect different topics and even different fields are insightful.
• It’s better to read 10 papers superficially than to read one paper in depth. With a paper management system, you can always go back and read more deeply.
1. Work on two things.

• If you only have one project in hand, then: 1) when the project fails, you lose motivation; 2) when the project comes to a standstill, you have no choice but to keep grinding. In comparison, if you have two projects, you can switch to the other one when one is stagnant, which gives you a rest and maybe a new perspective.

• If the two projects are in different fields, then it’s better to focus on one project a day.

2. Be ambitious.

When you have two projects going on, one can be a safe one and you can take risks with the other one. The risky one might have more impact if it succeeds. Work on something that excites you. Challenge the status quo.

3. Collaborate.

• If you collaborate with others remotely, it’s important to “communicate clearly and to set expectations”.

• Be open in terms of collaboration. Work with someone other than your supervisor. Collaborate with people from a different university than your own.

4. Be proactive.

Seek out collaborations, meetings, and advice actively. Talk with people.

5. Write a blog.

Writing a blog helps you develop your communication skills.

6. Keep a source of positive energy.

• Work on things that excite you. If you cannot decide things to work on, try to find an angle that excites you.

• Establish a support network that you can count on. Surround yourself with positive people.

• Find something that you can fall back on when things don’t go well: exercise, a side project, meditation, blogging, to name just a few. Always remember that your well-being is the single most important thing.

This section is very inspiring for me personally. I come from a non-CS background and should use this as a strength. A different perspective on and/or approach towards problems is my strength.

8. Intern or visit a university.

• If you are deciding between the industry and academia, an internship or a research visit can give you insights.

9. Play the long game.

Be nice 😎.

## 2021-09-17 #

👍 Liu, L., Wang, Y., Sinatra, R., Giles, C. L., Song, C., & Wang, D. (2018). Hot streaks in artistic, cultural, and scientific careers . Nature, 559(7714), 396-399.

This paper is significant because it reconciles two opposing models of individuals' career productivity. The first model is the “Matthew effect”, which holds that there is a unique period when a person performs consistently better than in other periods. The second model argues that the best work is purely random and is driven by high productivity.

The hot-streak model proposed by this paper argues that peak-performance period occurs randomly in a person’s career, but high-impact work is more likely to occur during this hot-streak period.

The study begins by investigating the probability of each of the three most impactful works ($N^{*}$, $N^{**}$, and $N^{***}$) by an artist, a film director, and a scientist and finds that each of the three works by an individual occurs randomly. But when the authors looked at the distance between individuals' most impactful works, they found that high-impact works tend to occur together. That is to say, there is a temporal correlation between highly impactful works for an individual.

Some other key findings in this paper:

1. All three careers, i.e., artists, directors, and scientists have peak-performance periods, but the hot-streak pattern varies across careers. See Figure 2 for details.

2. Hot streaks rarely happen more than once.

3. For the three careers considered in this paper, a hot streak usually lasts 3-6 years.

4. People are not more productive during a hot streak period.

## 2021-09-15 #

Kim, S. Y. S., Lebovits, H., & Shugars, S. (2021). Building a Bigger Table: Networking 101 For Graduate Students .

• Socialize with those with similar experience and similar interests to yours.

• Whatever your rank or power is, you are able to make someone in your community feel welcome. Aim to find those who feel uneasy and alone, and try to make them feel welcome. Strive to establish a community, not only your personal network.

• Try to connect people, rather than simply increase your own social network.

• Reach out to senior scholars four weeks ahead of the conference and ask for their time to give you feedback on your work. If they don’t respond, don’t take it personally. Maybe they receive too many emails and are not able to respond to yours.

## 2021-09-14 #

Halpern, B. S., Longo, C., Hardy, D., McLeod, K. L., Samhouri, J. F., Katona, S. K., … & Zeller, D. (2012). An index to assess the health and benefits of the global ocean . Nature, 488(7413), 615-620.

I skimmed through this paper.

## 2021-09-13 (completed on 2021-09-14) #

Finished Lazer et al. (2021)

• Does the data we have measure the construct we want to study?

• Are the people we are studying representative of the population? Can we generalize our findings?

• Is it ethical to access & analyze the data that we have?

## 2021-09-12 #

1. Finished Malmgren et al. (2010)

2. Lazer, D., Hargittai, E., Freelon, D., Gonzalez-Bailon, S., Munger, K., Ognyanova, K., & Radford, J. (2021). Meaningful measures of human society in the twenty-first century . Nature, 595(7866), 189-196.

PP. 1-3

## 2021-09-11 #

1. Finished Gurrieri et al. (2021)

As the title indicates, alcohol reduces the distance between strangers, but not that between friends.

Under the backdrop of COVID-19, it might serve the public’s interests to limit the open hours of bars.

1. Malmgren, R. D., Ottino, J. M., & Amaral, L. A. N. (2010). The role of mentorship in protégé performance . Nature, 465(7298), 622-626.

PP. 1-2

## 2021-09-10 #

One year goal achieved!

What a year. I was working on YY’s COVID-19 visualization project a year ago and then busily prepared for my PhD application . Results turned out well. I moved to a new place, Madison, WI, to pursue my studies, in a different field from what I have been doing for the past 10 years. Not sure how long I can keep reading papers like this, but who knows. Fingers crossed.

A glance: 211 papers/reports/blogs read; 211 hours and 34 minutes collected.

1. Finished reading reports on South Korea, India, Singapore, and New Zealand, which are part of Fleming (2016) .

2. Gurrieri, L., Fairbairn, C. E., Sayette, M. A., & Bosch, N. (2021). Alcohol narrows physical distance between strangers . PNAS, 118(20).

PP. 1-1

## 2021-09-09 (completed on 2021-09-10) #

Fleming, N. (2016). Career guide: Asia-Pacific . Nature, 536(7617), S1-S1.

I finished reading China, Japan, and Australia.

## 2021-09-08 (completed on 2021-09-10) #

Finished Hofman et al. (2021)

• Social scientists focus on explanations, often of causal relationships derived from theories. Computer scientists, on the other hand, emphasize predictions. As long as a model improves predictions, computer scientists are okay with the model being super complex.

• What NHST, a method applied widely in social science research, does is that it tests whether the hypothesized effect derived from theories is not zero. If it is not zero, then researchers conclude these observations are not in conflict with the theory, so the theory can continue to be used as an explanatory tool.

However, NHST has the following flaws:

• NHST does not test directly the outcome of interest, nor the magnitude of effect. It only looks at whether the hypothesized effect based on a theory is non-zero.

• In reality, many effects are non-zero. Therefore, statements that a theory cannot be ruled out by data are not as valid as they appear.

• Computer scientists use models to predict. It should be noted that the value of a model does not only depend on its absolute performance, but also its comparison with a baseline. For example, search data is highly correlated with flu numbers two weeks later, which is quite impressive. But it turns out that the same correlation can be found between CDC case numbers in previous weeks, and case numbers in future weeks.
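To make the baseline point concrete for myself, here is a minimal sketch of a "lagged self" baseline: how well does a series predict its own values a few weeks later? The numbers in `cases` are hypothetical, not data from the paper, and the helper names `pearson` and `lag_baseline_correlation` are my own. Any external predictor (such as search volume) would have to beat this correlation to add value.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def lag_baseline_correlation(series, lag):
    """Correlation between a series and the same series `lag` steps later.

    This is the naive baseline: predict the future value with the past value.
    """
    return pearson(series[:-lag], series[lag:])


# Hypothetical weekly case counts, for illustration only.
cases = [10, 12, 15, 20, 28, 35, 40, 38, 30, 22, 15, 11]
baseline_r = lag_baseline_correlation(cases, lag=2)
print(f"lag-2 baseline correlation: {baseline_r:.2f}")
```

A fancier model is only interesting if its correlation with future cases is clearly higher than `baseline_r` on the same weeks.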

## 2021-09-07 #

Hofman, J. M., Watts, D. J., Athey, S., Garip, F., Griffiths, T. L., Kleinberg, J., … & Yarkoni, T. (2021). Integrating explanation and prediction in computational social science . Nature, 595(7866), 181-188.

PP. 1-2

## 2021-09-06 #

1. Finished skimming through Conlen et al. (2019)

This paper analyzed audiences' log data from three interactive visualizations authored with Idyll, a markup language for creating interactive content.

• My main takeaways from this paper are:

• People usually spend around 1-2 minutes on a visualization.

• If some contents are hidden behind a click, around half of the people will not click and see it. Maybe they don’t notice it or they are satisfied with high level information enough to skip details.

• Many people will quickly skim through all the content before they dive into details (if they ever do so).

• How I can further this work:

1. The function of accessing users' log data is restricted to visualizations made by the Idyll language. However, most interactive visualizations available online might be created by other languages. This means we cannot use this feature to study interactive visualizations in the wild.

2. I am thinking of a study where I ask participants to interact with many interactive visualizations and look at what features of visualizations (1) make people spend more time and (2) lead to better recall.

## 2021-09-05 #

Conlen, M., Kale, A., & Heer, J. (2019, June). Capture & analysis of active reading behaviors for interactive articles on the web . In Computer Graphics Forum (Vol. 38, No. 3, pp. 687-698).

PP. 1-7

## 2021-09-04 #

Schaefer, J. D., Hamdi, N. R., Malone, S. M., Vrieze, S., Wilson, S., McGue, M., & Iacono, W. G. (2021). Associations between adolescent cannabis use and young-adult functioning in three longitudinal twin studies . PNAS, 118(14).

I skimmed through this paper.

This study is a twin study. It shows that using cannabis during adolescence has a possible causal relationship with young adults' socioeconomic outcomes. However, cannabis use is no longer associated with negative emotional and cognitive outcomes once we consider familial factors.

## 2021-09-03 #

Haghtalab, N., Jackson, M. O., & Procaccia, A. D. (2021). Belief polarization in a complex world: A learning theory perspective . PNAS, 118(19).

I skimmed through this paper. The authors use machine learning theories to model how people form beliefs. This approach is different from the Bayesian one. A Bayesian model of human learning posits that people have a prior and update this distribution based on new data. An ML approach, on the other hand, allows people to discard one prior for another when the old one fails to explain the data.

I don’t fully understand the details of this paper, so I am not 100% sure of my interpretation above.

## 2021-09-02 #

Buntaine, M. T., Zhang, B., & Hunnicutt, P. (2021). Citizen monitoring of waterways decreases pollution in China by supporting government action and oversight . PNAS, 118(29).

I skimmed through this paper.

## 2021-09-01 #

Swencionis, J. K., Pouget, E. R., & Goff, P. A. (2021). Supporting social hierarchy is associated with White police officers’ use of force . PNAS, 118(18).

I skimmed through this paper.

For White patrol officers, the higher their SDO (social dominance orientation, which means maintaining social hierarchies), the more force they use in the service of their job.

# 2021-08 #

## 2021-08-31 (Completed on 2021-09-01) #

Finished Starck et al. (2021)

Instrumental rationale of diversity: diversity brings educational benefits.

Moral rationale: diversity involves intrinsic values such as justice.

White participants (students & caregivers) prefer and expect more positive outcomes at instrumentally motivated universities compared to morally motivated ones. Black participants prefer and expect the opposite. University admission officers are aware of, and share, these preference and expectation differences.

Both texts from the official websites of universities and admission officers show that universities use instrumental rationales to a greater extent than moral ones.

Regression models show that when texts from official websites show low morality, high instrumentality correlates with worse graduation rates for Black students.

The results of this study show that the common approach to diversity among universities favors White Americans rather than racial minorities.

## 2021-08-30 #

Starck, J. G., Sinclair, S., & Shelton, J. N. (2021). How university diversity rationales inform student preferences and outcomes . PNAS, 118(16).

PP. 1-4

## 2021-08-29 #

Turkel, E., Saha, A., Owen, R. C., Martin, G. J., & Vasserman, S. (2021). A method for measuring investigative journalism in local newspapers . PNAS, 118(30).

The authors use a classifier to identify investigative reporting, and to measure how investigative each piece is, among 5.9 million articles published in the past decade (2010-2020) by a selection of 50 newspapers located in different regions of the US. They find that despite the recent widespread changes in the news industry, the quality of investigative reporting in US local newspapers was stable. That said, there is a decline in the past two years, possibly because of newsroom layoffs.

## 2021-08-28 #

Finished Gordon & Mendes (2021)

The authors use smartphone apps to measure participants' blood pressure and heart rates along with their self-reported experiences in daily life three times a day.

They find that when people are experiencing stress, their blood pressure (both systolic and diastolic) is significantly higher than their baseline level. The resources people have to cope with stress, such as financial resources and interpersonal networks, play a role here as well. The more resources people have, the lower their blood pressure is when experiencing stress.

[S]tress is best understood in its relation to the resources that individuals have.

There is a typo on p.5: “so these data are limited in how well then can characterize how stress and emotion affect BP reactivity in the very old (80 and older)”.

“Then” here should be “they”.

## 2021-08-27 (completed on 2021-08-28) #

1. Finished Salon et al. (2021)

2. Gordon, A. M., & Mendes, W. B. (2021). A large-scale study of stress, emotions, and blood pressure in daily life using a digital platform . PNAS, 118(31).

PP. 1-4

## 2021-08-26 #

1. Finished Mirza et al. (2021)

The authors used nighttime light (NTL) which is remotely sensed as a proxy for economic inequality across and within countries in the world. As can be seen in Fig. 1A and 1B, light Gini estimate (2010) almost overlaps with income Gini estimate (2010). Statistical analysis also shows that light Gini for high income countries is significantly lower than that for low and middle income nations.

Finer resolutions of NTL show that the areas with high inequality are: eastern China, Southern Africa, coastal areas of the USA (e.g., California and Texas), northern Egypt, and central Brazil.
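As a reminder to myself of what a Gini estimate measures, here is a minimal sketch of the textbook Gini coefficient applied to a list of values. It is not the authors' pipeline (I did not check how they weight pixels or population), and the light values below are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0 means everyone has the same amount; values near 1 mean a few
    observations hold almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per-person light values for two made-up regions.
evenly_lit = [5, 5, 5, 5]
unevenly_lit = [0, 0, 0, 10]
print(gini(evenly_lit), gini(unevenly_lit))
```

The same function works for income or for light; the paper's claim is that the two resulting estimates track each other closely across countries.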

1. Salon, D., Conway, M. W., da Silva, D. C., Chauhan, R. S., Derrible, S., Mohammadian, A. K., … & Pendyala, R. M. (2021). The potential stickiness of pandemic-induced behavior changes in the United States . PNAS, 118(27).

PP. 1-1

## 2021-08-25 #

1. Mueller, H., Groeger, A., Hersh, J., Matranga, A., & Serrat, J. (2021). Monitoring war destruction from space using machine learning . PNAS, 118(23).

I skimmed through this paper. The authors show that machine learning can be used to monitor war destruction of urban infrastructure.

1. Mirza, M. U., Xu, C., van Bavel, B., van Nes, E. H., & Scheffer, M. (2021). Global inequality remotely sensed . PNAS, 118(18).

PP. 1-1

## 2021-08-24 #

Zhao, Y., Wang, Z., Shen, Z. J. M., & Sun, F. (2021). Assessment of battery utilization and energy consumption in the large-scale development of urban electric vehicles . PNAS, 118(17).

I skimmed through this paper.

• In Beijing, long-distance EVs do not necessarily use a large portion of the vehicle’s available energy. Also, due to technological constraints, around 35% of battery energy of EVs cannot be used.

• It’s better not to develop EVs so fast in cities with extremely high or low temperatures.

## 2021-08-23 (Completed on 2021-08-24) #

Skimmed through the rest of Clayton et al. (2021)

For Trump supporters, exposure to his norm-violating statements regarding election fraud decreases their trust in elections. But exposure to these statements increases trust in elections among Trump disapprovers.

## 2021-08-22 #

1. Skimmed through the rest of Chaabouni et al. (2021)

2. Suran, M. (2021). News Feature: Keeping Black students in STEM . PNAS, 118(23).

We need resource allocation and related policies to increase diversity in STEM.

1. Clayton, K., Davis, N. T., Nyhan, B., Porter, E., Ryan, T. J., & Wood, T. J. (2021). Elite rhetoric can undermine democratic norms . PNAS, 118(23).

PP. 1-1

## 2021-08-21 #

1. Skimmed through the rest of Nanakdewa et al. (2021)

2. Chaabouni, R., Kharitonov, E., Dupoux, E., & Baroni, M. (2021). Communicating artificial neural networks develop efficient color-naming systems . PNAS, 118(12).

PP. 1-4

## 2021-08-20 (Completed on 2021-08-21) #

1. Nomi, T., Raudenbush, S. W., & Smith, J. J. (2021). Effects of double-dose algebra on college persistence and degree attainment . PNAS, 118(27).

Skimmed through this paper.

The authors find that providing additional training on algebra for incoming 9th graders with median skills in algebra, most of whom were minorities from low-income families, increased their life chances. For example, those who received double-dose on algebra stayed longer in college and were more likely to obtain college degrees.

The key to the success of this intervention is that (1) the schools strictly adhere to the policy, and (2) schools assign median-skills students to classes with similar or higher, but not lower, math skills.

1. Nanakdewa, K., Madan, S., Savani, K., & Markus, H. R. (2021). The salience of choice fuels independence: Implications for self-perception, cognition, and behavior . PNAS, 118(30).

PP. 1-1

## 2021-08-19 (Completed on 2021-08-21) #

1. Skimmed through the rest of Ellis et al. (2021)

This paper shows that around 75% of Earth’s land was already inhabited and reshaped by human activity over 12,000 years ago. The recent loss of biodiversity is better seen as the intensifying use of lands already reshaped by human activities.

1. Bird, M. I., Crabtree, S. A., Haig, J., Ulm, S., & Wurster, C. M. (2021). A global carbon and nitrogen isotope perspective on modern and ancient human diet . PNAS, 118(19).

I skimmed through this paper.

The isotope dietary breadth across industrialized nonsubsistence populations is only one third of that of pre-industrialization populations and subsistence populations in modern times.

## 2021-08-18 (Completed on 2021-08-21) #

1. Kraus, S., & Koch, N. (2021). Provisional COVID-19 infrastructure induces large, rapid increases in cycling . PNAS, 118(15).

I skimmed through this paper. The authors show that new provisional infrastructure for cycling led to an increase in cycling in more than 100 European cities.

1. Ellis, E. C., Gauthier, N., Goldewijk, K. K., Bird, R. B., Boivin, N., Díaz, S., … & Watson, J. E. (2021). People have shaped most of terrestrial nature for at least 12,000 years . PNAS, 118(17).

PP. 1-2

## 2021-08-17 (Completed on 2021-08-21) #

Persson, J., Parie, J. F., & Feuerriegel, S. (2021). Monitoring the COVID-19 epidemic with nationwide telecommunication data . PNAS, 118(26).

I skimmed through this paper.

The authors used telecommunication data obtained from Swisscom to estimate the effects of COVID-19 policy measures in Switzerland on human mobility from 10 February to 26 April 2020 and to test the predictive power of human mobility inferred from telecommunication data for reported COVID-19 cases. The results show that related measures reduced human mobility, and reduction in human mobility predicted decreased reported cases.

## 2021-08-16 #

McNulty, J. K., Meltzer, A. L., Neff, L. A., & Karney, B. R. (2021). How both partners’ individual differences, stress, and behavior predict change in relationship satisfaction: Extending the VSA model . PNAS, 118(27).

I skimmed through this paper.

Based on 10 independent longitudinal studies, this paper tests the Vulnerability-Stress-Adaptation (VSA) model to understand why marital satisfaction declines. The findings of this research support the central tenets of the VSA model but also propose some revisions to the model.

## 2021-08-15 #

1. Finished Western et al. (2021)

The authors analyzed jail admission records in New York City from 2008 to 2017 and came to three findings:

1. Black men are 8-20 times more likely to be put into jail compared to White men. Latino men are also at a higher risk of jail incarceration.

2. Jail incarceration is concentrated in high-poverty areas.

3. Cuts in the jail population during the 10 year period are associated with reductions in risks of jail incarceration for Black and Latino men.

2. Preston, S. H., & Vierboom, Y. C. (2021). Excess mortality in the United States in the 21st century . PNAS, 118(16).

I skimmed through this paper. The authors compared the US and the five largest European countries in terms of mortality. Fig. 1A shows that the US has a higher mortality rate for most age ranges, except for those over age 85.

Fig. 1 shows ratios. I guess they need to use a log scale??
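My reasoning for the log scale, as a minimal sketch: a ratio of 2 and a ratio of 0.5 are equally far from parity (a ratio of 1) in relative terms, but they only sit at equal distances from it on a log axis. The ratios below are made up for illustration and are not taken from the paper.

```python
import math

# Hypothetical mortality ratios (NOT values from the paper).
ratios = [0.5, 1.0, 2.0]

# Distance from parity (ratio = 1) on a linear axis vs. a log axis.
linear_distance = [abs(r - 1.0) for r in ratios]
log_distance = [abs(math.log(r)) for r in ratios]

for r, lin, lg in zip(ratios, linear_distance, log_distance):
    print(f"ratio={r:<4} linear distance={lin:.3f} log distance={lg:.3f}")
```

On the linear axis, halving looks only half as large as doubling; on the log axis the two are symmetric.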

## 2021-08-14 (completed on 2021-08-15) #

1. Kruczkiewicz et al. (2021). Opinion: Compound risks and complex emergencies require new approaches to preparedness . PNAS, 118(19).

2. Western, B., Davis, J., Ganter, F., & Smith, N. (2021). The cumulative risk of jail incarceration . PNAS, 118(16).

PP. 1-2

## 2021-08-13 #

2. Nyberg, L., Magnussen, F., Lundquist, A., Baaré, W., Bartrés-Faz, D., Bertram, L., … & Fjell, A. M. (2021). Educational attainment does not influence brain aging . PNAS, 118(18).

The finding of this paper challenges the belief that higher education translates into slower brain aging.

## 2021-08-12 #

1. Skimmed through the rest of Thöni & Volk. (2021)

The title captures the main findings well, that is, men have greater variability in time, risk, and social preferences.

1. Conley, D., & Johnson, T. (2021). Opinion: Past is future for the era of COVID-19 research in the social sciences . PNAS, 118(13).

2. Adam, D. (2021). Core Concept: Muography offers a new way to see inside a multitude of objects . PNAS, 118(14).

PP. 1-2

## 2021-08-11 #

1. Andersen, S. H., Richmond-Rakerd, L. S., Moffitt, T. E., & Caspi, A. (2021). Nationwide evidence that education disrupts the intergenerational transmission of disadvantage . PNAS, 118(31).

I skimmed through it.

Those who are at a health and social disadvantage tend to have both parents and children who are high-need users of health and social sectors. This means the disadvantage can be transmitted from the previous generation and on to the next generation.

Education disrupts the transmission from the previous generation and to the next generation. For people in the same generation, siblings with higher education have reduced risks for these disadvantages.

1. Thöni, C., & Volk, S. (2021). Converging evidence for greater male variability in time, risk, and social preferences . PNAS, 118(23).

PP. 1-3

## 2021-08-10 #

Asimovic, N., Nagler, J., Bonneau, R., & Tucker, J. A. (2021). Testing the effects of Facebook usage in an ethnically polarized setting . PNAS, 118(25).

I skimmed through it.

Deactivating Facebook for a week during a genocide commemoration in Bosnia and Herzegovina led to greater outgroup hostility.

## 2021-08-09 #

Riedl, C., Kim, Y. J., Gupta, P., Malone, T. W., & Woolley, A. W. (2021). Quantifying collective intelligence in human groups . PNAS, 118(21).

I skimmed through it.

## 2021-08-08 (Completed on 2021-08-09) #

Clauset, A., Larremore, D. B., & Sinatra, R. (2017). Data-driven predictions in the science of science . Science, 355(6324), 477-480.

A dangerous possibility is that funders, publishers, and universities will exploit large bibliographic databases to create new systems that automatically evaluate the future “impact” of project proposals, manuscripts, or young scholars. Such data-mining efforts should be undertaken with extreme caution.

A troubling trend, however, is the nearly annual declaration by a Nobel laureate that their biggest discovery would not have been possible in today’s research environment. The 2016 declaration came from Ohsumi, who decried the fact that “scientists are now increasingly required to provide evidence of immediate and tangible application of their work”.

… a more reliable engine for generating scientific discoveries may be to cultivate and maintain a healthy ecosystem of scientists rather than focus on predicting individual discoveries.

## 2021-08-07 #

1. Finished Jiska (2018)

2. Finished Sharma, N. (2020). Two or three things I learned from being alone .

## 2021-08-06 #

1. Finished González-Bailón & De Domenico (2021)

Twitter and the web are different news ecosystems; unverified bots are not the cause for these differences.

1. Jiska (2018). Empowering AI to drive humanity’s moral evolution.

PP. 1-2

## 2021-08-05 #

1. Ornes, S. (2021). News Feature: The tricky challenge holding back electric cars . PNAS, 118(26).

Dendrites are the major problem facing the future development of EVs.

1. González-Bailón, S., & De Domenico, M. (2021). Bots are less central than verified accounts during contentious political events . PNAS, 118(11).

PP. 1-4

## 2021-08-04 #

Finished Long et al. (2020)

## 2021-08-03 #

1. Masaki, H., Shibata, K., Hoshino, S., Ishihama, T., Saito, N., & Yatani, K. (2020, April). Exploring nudge designs to help adolescent SNS users avoid privacy and safety threats . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-11).

I skimmed through it. The main findings are:

1. To nudge people to be aware of privacy and safety issues on social media, it’s better to use negative expressions such as “90% of people would not post images without permission” than to use affirmative ones like “10% of people would post images without permission”.

2. General suggestions such as “People would feel uncomfortable with this” also work.

2. Ashtari, N., Bunt, A., McGrenere, J., Nebeling, M., & Chilana, P. K. (2020, April). Creating augmented and virtual reality applications: Current practices, challenges, and opportunities . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-13).

I skimmed through it.

1. Long, D., & Magerko, B. (2020, April). What is AI literacy? Competencies and design considerations . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-16).

PP. 1-2

## 2021-08-02 #

Finished Kecht (2017)

I understand why the hedge fund manager (end of p. 18) believes he deserves the money: those who excel get more money. This idea makes sense when you only look at the “market” you are in. But when you look at the world from above, you will realize that the reason you can be successful and earn so much money is not that you are hardworking (millions of people work harder than you do but have failed); it’s that many others have helped you in one way or another and you are simply lucky. The moment you look down upon a poor man as not as hardworking as you are, you misunderstand how the world works.

PP. 1-15

# 2021-07 #

## 2021-07-31 #

Demeter, M. (2018). Nobody notices it? Qualitative inequalities of leading publications in communication and media research . International Journal of Communication, 12, 31.

I skimmed through this paper. The main conclusions are that (1) journals originating in developed countries mostly publish papers by authors from developed countries; (2) cross-cultural collaboration is not common in communication and media studies.

## 2021-07-30 #

People receiving personalized ads feel negative emotions, but they continue to disclose private information.

PP. 2-9

## 2021-07-29 #

1. 👍 Finished Bollen et al. (2021)

The authors analyzed 14 million books (Google Books) in English (US), German, and Spanish in 125 years (1855-2019) based on the prevalence of labeled cognitive distortion schemata (CDS) language patterns. As Fig. 2 shows, the prevalence of CDS language markers is much higher nowadays than 100 years ago. The increase starting from 2007/2008 is very obvious.

I am wondering whether the results are due to artifacts. In the past, authors came from the elite few. Nowadays, it’s becoming more common for ordinary people to publish. Therefore, books in the past were not good indicators of the general public. That said, this is a very impressive study. Since the data is freely available online for everyone, I can no longer use “lack of data” as an excuse for not being able to publish high quality research.

1. Hanson, J., Wei, M., Veys, S., Kugler, M., Strahilevitz, L., & Ur, B. (2020, April). Taking Data Out of Context to Hyper-Personalize Ads: Crowdworkers' Privacy Perceptions and Decisions to Disclose Private Information . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-13).

PP. 1-2

## 2021-07-28 #

1. Skimmed through the rest of Murray et al. (2020) . The paper is too technical for me.

2. Bollen, J., ten Thij, M., Breithaupt, F., Barron, A. T., Rutter, L. A., Lorenzo-Luaces, L., & Scheffer, M. (2021). Historical language records reveal a surge of cognitive distortions in recent decades . PNAS, 118(30).

PP. 1-4

## 2021-07-27 #

1. Skimmed through the rest of Chattopadhyay et al. (2020)

For researchers, it’s important that we support the languages data scientists actually use, not the languages we wish they would use.

• Data scientists want version controls for their notebooks.

This article puts the Related Work section at the end, which I think is pretty good! After all, why do we have to put the literature review at the beginning?

1. Murray, D., Yoon, J., Kojaku, S., Costas, R., Jung, W. S., Milojević, S., & Ahn, Y. Y. (2020). Unsupervised embedding of trajectories captures the latent structure of mobility . arXiv preprint arXiv:2012.02785.

PP. 1-4

## 2021-07-26 #

1. Finished Simon (1969).

2. Chattopadhyay, S., Prasad, I., Henley, A. Z., Sarma, A., & Barik, T. (2020, April). What’s Wrong with Computational Notebooks? Pain Points, Needs, and Design Opportunities . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

PP. 1-1

## 2021-07-25 #

Continued with Simon (1969).

Designing an intelligence system on the principle that attention is scarce and must be preserved is quite different from designing it on the principle of “the more information the better.”

The proper aim of a management information system is not “to bring the manager all the information he needs,” but to reorganize the manager’s environment of information so as to reduce the amount of time he must devote to receiving it.

Knowledge from the laboratory is not always cheaper——and frequently is much less reliable——than knowledge from life.

PP. 10-22

## 2021-07-24 #

1. Skimmed through the rest of Börner et al. (2019)

Encodings from the most accurate to the least: position, length, angle & rotation, area.

1. Simon, H. A. (1969). Designing organizations for an information-rich world. Brookings Institute Lecture.

What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention, and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.

In an information-rich world, most of the cost of information is the cost incurred by the recipient. It is not enough to know how much it costs to produce and transmit it; we must also know how much it costs, in terms of scarce attention, to receive it.

If we tell someone he can have certain information processing services free, or almost free, he may demand almost an infinite amount of them.

PP. 1-10

## 2021-07-23 #

1. Stuart, T. E., & Ding, W. W. (2006). When do scientists become entrepreneurs? The social structural antecedents of commercial activity in the academic life sciences . American journal of sociology, 112(1), 97-144.

I skimmed through it.

1. Börner, K., Bueckle, A., & Ginda, M. (2019). Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments . PNAS, 116(6), 1857-1864.

PP. 1-4

## 2021-07-22 #

Read the reviews in Besançon & Dragicevic (2019)

It’s interesting to see what Professor Jessica Hullman says:

I suspect the difficulty for many researchers to learn alternatives to NHST may be one cause (for researchers being “selectively listening” to advice on statistical reform).

## 2021-07-21 #

👍 Finished Besançon & Dragicevic (2019)

The authors converted all CHI papers between 2010 and 2018 from PDF to texts. They wanted to understand the statistical reporting practices by analyzing the parsed texts. Their results showed that:

1. Exact p-values reporting has been increasing modestly.
2. Reporting confidence intervals has been increasing from 6% in 2010 to 15% in 2018.
3. Reporting exclusively p-value inequalities (such as p < 0.01) has been decreasing modestly.
4. Reporting exact p-values does not reduce dichotomous inference.
5. Papers that exclusively report confidence intervals are much less likely to report dichotomous inference.
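To get a feel for what analyzing parsed text for statistical reporting might involve, here is a minimal sketch. It is not the authors' pipeline: the regular expressions and the `classify_reporting` helper are my own assumptions, and real papers format their statistics in many more ways than these patterns cover.

```python
import re

# Hypothetical patterns, for illustration only.
EXACT_P = re.compile(r"\bp\s*=\s*0?\.\d+", re.IGNORECASE)   # e.g. "p = .032"
INEQ_P = re.compile(r"\bp\s*[<>]\s*0?\.\d+", re.IGNORECASE)  # e.g. "p < 0.01"
CI = re.compile(r"\b\d{2}\s*%\s*CI\b|\bconfidence intervals?\b", re.IGNORECASE)


def classify_reporting(text: str) -> dict:
    """Flag which kinds of statistical reporting appear in a text."""
    return {
        "exact_p": bool(EXACT_P.search(text)),
        "p_inequality": bool(INEQ_P.search(text)),
        "confidence_interval": bool(CI.search(text)),
    }


sample = "The effect was significant (p < .01), 95% CI [0.2, 0.8]."
result = classify_reporting(sample)
print(result)
```

Running something like this over each paper's text and counting the flags per year would give trend lines similar in spirit to the ones listed above.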

## 2021-07-20 #

1. Finished Gulati et al. (2021)

2. Human deaths and injuries dominate damages caused by wild animals in India. Damages from losses of crops and livestock are very small compared to that from human casualties.

3. In terms of cost, damage by an elephant is over 600 times higher than that by the herbivores pig and nilgai.

4. Elephants cause more human casualties than any other animal.

5. Besançon, L., & Dragicevic, P. (2019, May). The continued prevalence of dichotomous inferences at CHI . In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-11).

PP. 1-3

## 2021-07-19 #

1. Skimmed through the rest of Tsai et al. (2000)

For ABCs (American-born Chinese), “being Chinese” and “being American” are unrelated: they can be Chinese in one setting, and be American in another.

On the other hand, for immigrant Chinese, “being Chinese” and “being American” are negatively related: as they become more American, they become less Chinese.

1. Gulati, S., Karanth, K. K., Le, N. A., & Noack, F. (2021). Human casualties are the dominant cost of human–wildlife conflict in India . PNAS, 118(8).

PP. 1

## 2021-07-18 #

1. Rossman, G., & Fisher, J. C. (2021). Network hubs cease to be influential in the presence of low levels of advertising . PNAS, 118(7).

The authors' simulation shows that when diffusion involves external influences such as mass media, seeding with a hub, i.e., opinion leader, does not lead to faster adoption of an innovation. This implies that, for example, when promoting health information, massive advertising might be more effective than we previously thought.

1. Tsai, J. L., Ying, Y. W., & Lee, P. A. (2000). The meaning of “being Chinese” and “being American”: Variation among Chinese American young adults . Journal of Cross-Cultural Psychology, 31(3), 302-332.

PP. 1-3

## 2021-07-17 #

Finished Anderson et al. (2020)

• Algorithmically-driven listening on Spotify is associated with decreased diversity in users' music consumption.

• High diversity in content consumption on Spotify is associated with a higher level of user retention (i.e., staying on the platform) and user conversion (from trying the free version to paying for the premium service).

## 2021-07-16 #

1. Finished Xiong et al. (2019) .

Experiment 1: People perceived more causality from 2-bar bar charts, and less from scatter plots.

Experiment 2:

1. People rated bar visual encoding marks as the least causal, and dot encodings the most causal.
2. Decreased data aggregation (i.e., with more bins) lowers perceived causality; increased data aggregation raises perceived causality.
• Experiment 3: scatter plots are perceived as the most causal, followed by line graphs. Histograms are seen as the least causal.

• A comparison between Experiment 2 and 3 shows that data aggregation in data visualization is associated with more causality perceived.

There seem to be three mistakes in the article:

1. Caption of Fig. 10 should be “Three aggregation levels tested in Experiment 2..”?
2. Caption of Fig. 11 should be “Experiment 2 (top) and Experiment 3 (bottom)”?
3. The last sentence in Section 6.3 should be “… that aggregated data (Experiment 2) were perceived more causal than visualizations that did not (Experiment 3…)”?

1. Anderson, A., Maystre, L., Anderson, I., Mehrotra, R., & Lalmas, M. (2020, April). Algorithmic effects on the diversity of consumption on spotify . In Proceedings of The Web Conference 2020 (pp. 2155-2165).

PP. 1-3

## 2021-07-15 #

Xiong, C., Shapiro, J., Hullman, J., & Franconeri, S. (2019). Illusion of causality in visualized data . IEEE transactions on visualization and computer graphics, 26(1), 853-862.

PP. 1-8

## 2021-07-14 #

Finished Correll et al. (2020) .

Exaggeration caused by y-axis truncation persists no matter whether it’s a line chart or a bar chart, and no matter whether or not you explicitly inform the reader of the truncation.

## 2021-07-13 #

1. Finished Jones (2009).

2. Correll, M., Bertini, E., & Franconeri, S. (2020, April). Truncating the y-axis: Threat or menace? . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

PP. 1-3

## 2021-07-12 #

Continued with Jones (2009).

An analysis of the data set of all the patents issued by the U.S. Patent and Trademark Office (USPTO) between 1975 and 1999 shows that:

1. Team size is rising;
2. The age at first innovation is rising;
3. It is less common for solo innovators to switch fields.

PP. 1-13

## 2021-07-11 #

1. Finished Way et al. (2017)
• Inequality in individual productivity has been decreasing. Measured by Gini coefficients, it has dropped from 0.62 in the 1970s to 0.40 in the 2000s.

• High-ranking institutions both select for and boost researchers' productivity.
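As a reminder to myself of what a Gini coefficient measures, here is a minimal sketch using the standard rank-weighted formula on sorted values. The numbers are made-up publication counts, and I am not claiming this is exactly how the paper computed its figures.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on sorted data:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical publication counts for four researchers.
equal = gini([5, 5, 5, 5])     # everyone equally productive
skewed = gini([0, 0, 0, 20])   # one person produces everything
print(equal, skewed)
```

With this formula, a group in which one person accounts for all the output approaches 1 as the group grows, while a perfectly even group scores 0.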

1. Jones, B. F. (2009). The burden of knowledge and the “death of the renaissance man”: Is innovation getting harder? . The Review of Economic Studies, 76(1), 283-317.

PP. 1

## 2021-07-10 #

1. I skimmed through the rest of Giuntella et al. (2021).

2. Way, S. F., Morgan, A. C., Clauset, A., & Larremore, D. B. (2017). The misleading narrative of the canonical faculty productivity trajectory . PNAS, 114(44), E9216-E9223.

PP. 1-3

## 2021-07-09 #

1. Crowley, D. M., Scott, J. T., Long, E. C., Green, L., Israel, A., Supplee, L., … & Giray, C. (2021). Lawmakers' use of scientific evidence can be improved . PNAS, 118(9).

Research-to-Policy Collaboration (RPC), a way to improve lawmakers' use of research evidence (URE), can increase lawmakers' perceived usefulness of scientific research and their URE in the bills they introduce. In addition, RPC can boost researchers' knowledge, motivation, and actual behavior of policy engagement.

1. Giuntella, O., Hyde, K., Saccardo, S., & Sadoff, S. (2021). Lifestyle and mental health disruptions during COVID-19 . PNAS, 118(9).

PP. 1-2

## 2021-07-08 #

1. Pavelski, J. (2018). How to Get Lucky: The Secrets to Creating Your Own Good Fortune
• Show appreciation. Don’t take what others have done for you for granted.

• Take risks. Try new things. Meet new people.

• Embrace crazy ideas. Look at everything coming to you as a gift and embrace it.

• About taking risks: do something little first.

[P]assion follows engagement, not the other way around.

• You want an overlap in your passion, skills and market, and that’s where your sweet spot is.

Most of us choose where we’re going to live, who we’re going to spend time with, what kind of job we’re going to have. I think that so many people limit themselves, they make a box around themselves that’s much smaller than it needs to be.

1. Watts, D. J. (2007). A twenty-first century science . Nature, 445(7127), 489-489.

(Social networks) are not unitary, but multiplex, meaning that people maintain a portfolio of types of ties – formal, informal, strong, weak, sexual, business and friendship – each of which serves different functions.

## 2021-07-07 #

1. Finished Toubia, O., Berger, J., & Eliashberg, J. (2021). How quantifying the shape of stories predicts their success . PNAS, 118(26).

Three measures of texts (speed, volume, and circuitousness) can predict the success of movies, TV shows, and academic papers.

1. Kao, J., & Gillum, J. (2020). Reverse-Engineering an Audio Aggression Detection Algorithm . In ProPublica, Computational journalism Symposium.

I skimmed through it. It concludes that the Sound Intelligence aggression detection algorithm does not perform very well. There are many false positives and false negatives.

## 2021-07-06 #

1. Finished Ch.2 & 3 (PP. 9-59) of Chakrabarty, P. (2012). A guide to academia: getting into and surviving grad school, postdocs, and a research job. John Wiley & Sons.

• Learn skills, tools, and techniques early in graduate school.

• By the time you take qualifying exams, the committee will be more impressed if you already have several publications and grants under your belt than if you perfectly answer every question they ask you.

• Always bring a notebook and a pen with you. Take them out when you are talking with your advisor, with senior graduate students, or attending a seminar.

You didn’t go to graduate school to take classes; you are there to do original research.

When starting out as a new graduate student, one of the best ways to start working toward increasing your publication list is by writing a review paper.

If you are interested in an undergraduate class, it is better to TA that class than to actually register to take it as a student; you will learn more as the TA, and you’ll be exempt from the headache of having to actually take exams.

(For TA classes), have designated office hours, but ask students to tell you in advance if they will be coming.

It is much easier to write your thesis as you go, submitting chapters as publications along the way.

Faculty don’t get very much credit for being on someone’s committee unless they are the primary advisor.

Be ambitious with your dissertation proposal.

1. Toubia, O., Berger, J., & Eliashberg, J. (2021). How quantifying the shape of stories predicts their success . PNAS, 118(26).

PP. 1-2

## 2021-07-05 #

Continued with Chakrabarty, P. (2012).

PP. 39-51

## 2021-07-04 #

Continued with Chakrabarty, P. (2012).

PP. 22-39

## 2021-07-03 #

Continued Chakrabarty, P. (2012)

PP. 10-22

## 2021-07-02 #

1. Finished Twenge, J. M. (2017). Have smartphones destroyed a generation . The Atlantic, 9, 2017.

In the next decade, we may see more adults who know just the right emoji for a situation, but not the right facial expression.

1. Chakrabarty, P. (2012). A guide to academia: getting into and surviving grad school, postdocs, and a research job. John Wiley & Sons.

PP. 9-10

## 2021-07-01 #

1. Finished Skipper, C. (2018). How (And Why) to Build Some Boredom Back Into Your Life

[T]he default mode is when you do your most original thinking.

Until we have real regulation of big tech, self regulation needs to be practiced and taught, too.

…this idea of feeding ourselves more and more and more information and then never actually doing anything with it… “Climate change is a disaster.” Well, okay, yes. But then what? Does that mean that you are going to make a donation? Does that mean that you are going to start an initiative at work? Does that mean you are going to talk to your kids about it?

1. Twenge, J. M. (2017). Have smartphones destroyed a generation . The Atlantic, 9, 2017.

PP. 1-6

# 2021-06 #

## 2021-06-30 #

1. Skipper, C. (2018). The Most Important Survival Skill for the Next 50 Years Isn’t What You Think

[Y]ou’ll be the target of attention-grabbing, behavior-modifying algorithms so exponentially effective you won’t even realize you’re being targeted. – The author of this interview

I don’t have a smartphone. My attention is one of the most important resources I have, and the smartphone is constantly trying to grab my attention. There’s always something coming in.

I try to be very careful about how I use technology and really make sure that I’m using it for the purposes that I define instead of allowing it to kind of shape my purposes for me.

The technology kind of dictated to you that this is what you’re going to do by grabbing your attention in such a forceful way that it can kind of manipulate you.

1. Skipper, C. (2018). How (And Why) to Build Some Boredom Back Into Your Life

PP. 1-4

## 2021-06-29 #

• Numbers of publications and citations are not good measures of societal impact.

• “Use-inspired research”, research that can help solve real social problems, such as misinformation, inequality, and future pandemics.

1. Way, S. F., Morgan, A. C., Clauset, A., & Larremore, D. B. (2017). The misleading narrative of the canonical faculty productivity trajectory . PNAS, 114(44), E9216-E9223.

## 2021-06-28 #

• New streams of funding.

• New ways of interdisciplinary collaboration among scholars. The current academic system does not encourage either students or faculty to collaborate with people from other disciplines.

• New relationships among academia, government, and industry.

• New training and education programs

• We need more data, and better data. It’s desirable if we can build a new class of research infrastructure where many researchers create and maintain datasets for collective use. This will bring data of higher quality, and facilitate reproducibility.

• The current system for obtaining industry data is “nonsystematic, nontransparent, and inequitable.”

PP. 1-13

## 2021-06-27 #

👍 Jun, T., & Sethi, R. (2021). Extreme weather events and military conflict over seven centuries in ancient Korea . PNAS, 118(12).

The analysis is based on data extracted from historical records of ancient Korea.

• Countries experiencing extreme weather events are more likely to suffer from, rather than initiate, military attacks. This means that attacks during extreme weather events are more likely to be opportunistic rather than desperate.

• Food shortage mediates the relationship between extreme weather and being invaded. That is to say, extreme weather events have a significant effect on food insecurity which predicts invasion of a country.

• The possibility of successfully deterring an attack drops when the country being invaded is experiencing extreme weather events. The extent of this drop varies from country to country, though.

## 2021-06-26 #

Skipper, C. (2018). How to Kick Your Bad Habits (And Why That’s More Important Than You Think) .

• In the world we live now, people focus on results. Social media exacerbates the problem because people only share with others their results and highlights in life.

• True long-term thinking does not involve goals. Goals are about winning an instance of a game whereas systems are about “continuing to play the game.”

• You should view a habit you want to develop as a lifestyle (since you’ll have to do it forever) rather than a finish line to cross.

• For habits you wish to get better at, you need a process of reflection and review. Document your efforts and the results.

• To break a bad habit, you can either decrease your exposure to it or increase the difficulty to access it.

• In terms of a habit, even if you cannot do the whole thing, you should make a slight improvement. Throw away the ‘all-or-nothing’ mentality. If you only act on a habit once you feel it can transform your life altogether, you’ll end up achieving nothing.

• Find out the key moments in your day that determine how you spend the next one or two hours. Work backwards from those moments and try to figure out what trigger events caused them.

• “Prune away the good things to get to the great things”. This means to find out what is most important for you.

## 2021-06-25 #

Finished Cinelli et al. (2021)

• The aggregation of users into homophilic groups determines interaction patterns on social media.

• In terms of news consumption, Facebook shows higher segregation.

• Whether users can change the feed algorithm makes a big difference. The algorithm is tweakable by users on Reddit but not on Facebook or Twitter.

## 2021-06-24 (completed on 2021-06-25) #

1. Cinelli, M., Morales, G. D. F., Galeazzi, A., Quattrociocchi, W., & Starnini, M. (2021). The echo chamber effect on social media . PNAS, 118(9).

PP. 1-1

1. Finished Stieger et al. (2021)

### Research questions: #

1. Do personality trait change interventions work compared to the control group?

2. Do personality traits change in the desired direction?

3. Do other people detect these changes?

4. Can the changes be maintained?

### Results #

1. Participants in the intervention group show significantly greater changes compared to the control group.

2. In the intervention group, participants who want to increase or decrease a trait see their desired changes. These changes are on the traits they want to change.

3. Observers only see increases for participants who want to increase a trait, but do not see decreases for those who want to decrease a trait. Further analysis shows that observers only see a significant increase in conscientiousness.

4. When traits are measured by self-reports, they do not change 3 months after the end of the intervention among participants who want an increase on a trait whereas they witness another decrease among participants who want a decrease on a trait. Further analysis shows that participants who want a decrease in neuroticism showed another significant decrease after the end of the intervention whereas participants who want a decrease in openness showed a significant increase in openness.

When traits are gauged by observers, they increase 3 months after the end of the intervention among participants who want an increase on a trait. Traits are maintained for participants who want a decrease. Further analysis shows that only participants who want an increase in openness show a significant increase in openness after the end of the intervention. All other participants do not show significant changes.

### Discussions #

People in the control group also have motives to change their personality traits. They do not show significant changes because they do not receive treatment. This means that merely having a motive to change one’s personality does not necessarily lead to change.

## 2021-06-23 #

Stieger, M., Flückiger, C., Rüegger, D., Kowatsch, T., Roberts, B. W., & Allemand, M. (2021). Changing personality traits with the help of a digital personality change intervention . PNAS, 118(8).

PP. 1-5

## 2021-06-22 #

1. Finished Skipper, C. (2018). Why Being Less Busy is the Key to Getting More Done

The most vital things in our work are rarely the most pleasurable, threatening, or novel. And that’s the problem with our attention.

The fascinating thing about email is: it takes up little of our time —— we maybe spend an hour or two on email a day —— but it takes up a disproportionate amount of our attention.

We can’t seamlessly switch from doing one thing to doing another thing. Our mind isn’t capable of that.

Notice what you pay attention to when you’re low on energy. So when you’re kind of burnt out a little bit, what apps do you fire up? What websites do you visit mindlessly, out of habit, that just stimulate you and prevent you from really resting? [Knowing those is] a way by which we can shut off autopilot.

When we say we don’t have time for something, we’re really just saying that something’s not important to us. You hear people say, “I don’t have time to read this book.” But then they have time to check email, or read the news, or go on social media for a few hours. What we really mean when we say we don’t have time for something is that we don’t have the attention or patience for it. You have time for everything. It’s just that you choose to do other things.

We are not able to focus and reflect at the same time. If you want to unearth more ideas and if you want more rest, I think you need to leave some space between the things you do.

Some people take a shower every morning, but they haven’t actually taken a shower in years, because their mind is everywhere but the shower. What use is a life that we don’t remember and experience?

Love is really no different than sharing quality attention with somebody.

1. Skipper, C. (2018). Why Self-Help Might Actually Be Making You Less Happy

[D]epression is the organism’s way of reacting, withdrawing, and perhaps metaphorically recharging the batteries. But now, there’s so much pressure in modern society to perform and to be productive, to be efficient, that we don’t have this time to recharge. We tend to pathologize these kinds of sadness or losses of energy.

Well, now we’ve actually had these therapies for 100 years and the world is just getting worse. How could that be? The answer is that all the sensitive, intelligent, resourceful people that should be out changing the world, they’re actually just sitting in the therapy room trying to realize themselves or improving themselves.

[But] staying with the image of the airplane, I would say the problem is nowadays that the plane is coming down, the pressure in the air cabin is falling, and the masks are coming down. People are putting on the masks, helping themselves, and breathing frantically into these masks. Then we call it mindfulness or therapy or self-help or whatever. And no one really gets up from the chair and tries to see what is going on with this aircraft. The pilots have passed out and the plane will actually crash because no one takes an interest in the whole structure. They only sit in their seat and breathe into their own masks. We actually need people who are brave enough to stop breathing into the mask and take an interest in the overarching structure.

That’s a really powerful metaphor!

I find it very important to be interested in something beyond yourself… Try not to be so obsessed with happiness. You have this happiness imperative: Life is about being happy. It’s ridiculous.

## 2021-06-21 #

1. Skipper, C. (2018). The Secret to Being a Productive Human: Take More Breaks (and Naps!)

To take more and better breaks.

Professionals take breaks, amateurs don’t take breaks.

Even a small break of two minutes is better than no breaks at all.

Moving is better than being stationary.

Be among nature: trees, being outside.

I do think there’s a false sense of urgency sometimes. I think it hurts our long-term performance.

Most people are “larks”, those who feel peaks in the morning. They should be vigilant, avoid distractions and do deep work in the morning. Try replying to routine emails in the afternoon. In the evening, do creative tasks that don’t require much vigilance and concentration.

1. Skipper, C. (2018). Why Being Less Busy is the Key to Getting More Done

PP. 1-3

## 2021-06-20 #

Skipper, C. (2019). Cal Newport on Why We’ll Look Back at Our Smartphones Like Cigarettes .

I don’t fear missing out. I fear not giving enough attention to the things that I already know for sure are important.

Yes, it’s scary not to be distracted, but I think it’s even more scary to avoid all of the deep good that comes from having to just be there with yourself, and confront all of those difficulties and opportunities that entails.

## 2021-06-19 #

Xie, Y. (2018). My Early Career Crisis (2014 - 2015)

• Reflect on your failures and make changes.

• Talk to people. Communicate.

• Your ego is your foe. Don’t let the outside prizes fool you.

• Know the person you want to be and do things to make it happen.

## 2021-06-18 #

Lutz, W., Reiter, C., Özdemir, C., Yildiz, D., Guimaraes, R., & Goujon, A. (2021). Skills-adjusted human capital shows rising global gap . PNAS, 118(7).

Mean years of schooling (MYS) alone is not sufficient to indicate a person’s true skills. The authors proposed a way to measure the quality of human capital. They used a literacy skills dataset covering countries across the globe to compute SLAMYS: skills in literacy adjusted mean years of schooling. Figure 3 shows that even though MYS indicates that the gap between the high-performing and low-performing countries is narrowing, the SLAMYS scores show that this gap is increasing fast. According to SLAMYS scores, the gap between countries above the third quartile and those below the first quartile is equivalent to ten years of schooling.
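To make the idea concrete for myself, here is a minimal sketch. It assumes that SLAMYS is MYS multiplied by an adjustment factor, namely a population's literacy score divided by a benchmark score. My notes above do not record the exact formula, so treat that as an assumption. The function name `slamys`, the two example countries, and all numbers are made up for illustration.

```python
def slamys(mys, literacy_score, benchmark_score):
    """Scale mean years of schooling by a literacy-based adjustment factor.

    Assumption: adjustment factor = literacy_score / benchmark_score.
    """
    if benchmark_score <= 0:
        raise ValueError("benchmark_score must be positive")
    return mys * (literacy_score / benchmark_score)


# Hypothetical data: (mean years of schooling, mean literacy score).
countries = {
    "Country A": (12.0, 280.0),
    "Country B": (8.0, 200.0),
}
benchmark = 270.0  # hypothetical benchmark literacy score

adjusted = {
    name: slamys(mys, score, benchmark)
    for name, (mys, score) in countries.items()
}

mys_gap = countries["Country A"][0] - countries["Country B"][0]
slamys_gap = adjusted["Country A"] - adjusted["Country B"]

print(f"Gap in MYS:    {mys_gap:.2f} years")
print(f"Gap in SLAMYS: {slamys_gap:.2f} years")
```

In this toy example the skills-adjusted gap is wider than the raw schooling gap, because the country with fewer years of schooling also has the lower literacy score.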

## 2021-06-17 #

Kizilcec, R. F., Makridis, C. A., & Sadowski, K. C. (2021). Pandemic response policies’ democratizing effects on online learning . PNAS, 118(11).

During the pandemic, DataCamp saw a 38% increase in the number of new users, and a 6% increase in exercise time by existing users.

Figure 2A shows that a slightly higher proportion of new registrations comes from high-income and high-college neighborhoods. Figure 2B shows that people in low-income and low-college communities experience a larger increase in their time spent on DataCamp.

This study indicates that when people have more free time, they can use that time for upskilling, and this change is relatively consistent across regions with varying income, racial and educational compositions. This means that to alleviate the skill gaps in the US workforce, it’s a good idea to give employees more free time so that people can upskill.

## 2021-06-16 (Completed on 2021-06-17) #

Finished Nielsen, M. (2004). Principles of Effective Research . Read my notes here .

## 2021-06-15 #

Continued with Nielsen, M. (2004).

PP. 7-8

## 2021-06-14 #

Continued with Nielsen, M. (2004).

PP. 4-7

## 2021-06-13 #

Continued with Nielsen, M. (2004).

PP. 3-4

## 2021-06-12 #

Nielsen, M. (2004). Principles of Effective Research

PP. 1-3

## 2021-06-11 #

👍 Finished Kwak et al. (2020)

The FrameAxis is an unsupervised method to characterize the framing intensity and framing bias for a given text. See Figure 2 to understand these two constructs.
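Here is a rough sketch of the two constructs as I understand them from the paper: a microframe axis is the difference between the embeddings of two antonyms, framing bias is the frequency-weighted mean cosine similarity between a text's words and that axis, and framing intensity is the frequency-weighted squared deviation of those similarities from a baseline bias. The toy 2-D embeddings, the word counts, and the function names are all made up for illustration. This is not the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def framing_bias(word_counts, embeddings, axis):
    """Frequency-weighted mean similarity of the words to the axis."""
    total = sum(word_counts.values())
    weighted = sum(n * cosine(embeddings[w], axis) for w, n in word_counts.items())
    return weighted / total


def framing_intensity(word_counts, embeddings, axis, baseline_bias):
    """Frequency-weighted squared deviation from the baseline bias."""
    total = sum(word_counts.values())
    weighted = sum(
        n * (cosine(embeddings[w], axis) - baseline_bias) ** 2
        for w, n in word_counts.items()
    )
    return weighted / total


# Toy 2-D embeddings (hypothetical, not from a real model).
embeddings = {"good": (1.0, 0.0), "bad": (-1.0, 0.0), "table": (0.0, 1.0)}

# Microframe axis: positive pole minus negative pole.
axis = tuple(p - n for p, n in zip(embeddings["good"], embeddings["bad"]))

# A text that uses both poles equally often.
doc = {"good": 2, "bad": 2, "table": 1}

bias = framing_bias(doc, embeddings, axis)
# Assumption: the baseline (whole-corpus) bias is 0 in this toy example.
intensity = framing_intensity(doc, embeddings, axis, baseline_bias=0.0)

print(f"bias={bias:.2f}, intensity={intensity:.2f}")
```

The toy text leans toward neither pole, so its bias is zero, yet its intensity is high because it uses words from both ends of the axis heavily. That is the distinction between the two constructs.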

## 2021-06-10 #

1. Finished Kim et al. (2012)

Eye-tracker cannot capture scans of peripheral vision.

1. Kwak, H., An, J., Jing, E., & Ahn, Y. Y. (2020). FrameAxis: Characterizing Framing Bias and Intensity with Word Embedding . arXiv preprint arXiv:2002.08608.

PP. 1-2

## 2021-06-09 #

Kim, S. H., Dong, Z., Xian, H., Upatising, B., & Yi, J. S. (2012). Does an eye tracker tell the truth about visualizations?: Findings while investigating visualizations for decision making . IEEE Transactions on Visualization and Computer Graphics, 18(12), 2421-2430.

PP. 1-5

## 2021-06-08 #

1. Practicing privacy: Encryption by Matt Might
1. Why peer reviewers should use The Onion Router (TOR) by Matt Might
• Tor Browser: https://www.torproject.org/
1. Tips and software for Mac users by Matt Might

## 2021-06-07 #

Finished Seraj et al. (2021)

The authors analyzed posts in the r/BreakUps subreddit, and found that a month before, during, and three months after the breakup, people’s analytical thinking drops (meaning their thoughts are less structured and logical) whereas cognitive processing increases (meaning that they are trying to understand the situation). Sample markers of cognitive processing (a.k.a. working through) include “because”, “understand”, “should”, etc. Also evident is that people are using more I-words and We-words.

The authors did not stop here and asked two additional questions: 1. Does writing a lot about the breakup help people recover? 2. How do these changes compare with language changes among people going through other kinds of emotional upheavals?

Most people (84%) spend 1-4 days talking about their breakups, whereas 16% of people spend 5 or more days. Both the short-term and long-term users have similar language patterns (analytical thinking, cognitive processing, I-words, and We-words) during the months leading up to the breakup. However, it takes long-term users twice as much time as short-term users to return to the baseline value of their language patterns. It’s impossible to establish a causal relationship, though. It might be that those who spend more time writing about their breakups experience a more difficult time and thus find it hard to adjust.

The changes in language patterns among people who experience divorce in r/Divorce mimic those of r/BreakUps users. But r/offmychest users have somewhat different patterns since they post various kinds of emotional upheavals.

## 2021-06-06 #

Seraj, S., Blackburn, K. G., & Pennebaker, J. W. (2021). Language left behind on social media exposes the emotional and cognitive costs of a romantic breakup . PNAS, 118(7).

PP. 1-5

## 2021-06-05 #

Finished Mirowski (2018).

❌ I personally don’t recommend this paper. It is unnecessarily long and contains so many complicated words which could have been replaced by simpler ones. The central theme, that open science does not make sense, does not have enough convincing arguments to persuade me.

## 2021-06-04 #

Continued with Mirowski (2018).

PP. 13-21

## 2021-06-03 #

Continued with Mirowski (2018).

• The assertion is wrong that science needs to be more democratic by engaging the public in the scientific process. Real science is hard. When you include the general public, what they can do is only simple tasks on the assembly line. It’s just meaningless. Furthermore, few ordinary people have the need to access and read scientific papers. Rather than serving the public’s interests, this movement is only doing good to corporations.

• Open science won’t accelerate the speed and quantity of scientific output.

PP. 6-13

## 2021-06-02 #

Mirowski, P. (2018). The future(s) of open science . Social studies of science, 48(2), 171-203.

• Open science, in the sense of including the public in the scientific process, does not increase the authority of science among the general public. Also, it seems that more education makes conservatives and liberals differ more in their opinions of science’s role in public life.

PP. 1-6

## 2021-06-01 #

👍 Barres, B. A. (2013). How to pick a graduate advisor . Neuron, 80(2), 275-279.

See my summary in Chinese here .

# 2021-05 #

## 2021-05-31 #

Newport, C. (2020). When technology goes awry . Communications of the ACM, 63(5), 49-52.

In this Viewpoint, Cal Newport argues that it’s better to see the impact of technologies from a determinist viewpoint: new technologies drive human behaviors in unplanned directions. His examples are email at IBM and the “Like” button created by a Facebook engineer. Email was meant to facilitate internal communication within IBM. However, people began sending many more messages the moment they started emailing. The “Like” button, designed to replace some approving comments such as “nice!” and “great!”, made people check their Facebook account constantly for rewards.

The author later coins the term “complex side effects”: negative impacts of new technologies that were not expected when the technologies were invented. Cal Newport argues that engineers should include the evaluation of complex side effects in their “iterative engineering process”.

Cal Newport mentions the website of Center for Humane Technology initiated by Tristan Harris .

## 2021-05-30 #

Finished Peyton Jones, S. (2014). How to write a great research paper .

1. When you have an idea, start writing the paper now! Don’t wait until your research is finished. No! “Writing papers is a primary mechanism for doing research, not just for reporting it”.

2. The primary goal of your research is to communicate a “useful and re-usable idea”. This idea should be “clear” and “sharp”. After reading your paper, the readers should be 100% sure what THIS idea is. Distill your paper into one single sentence which conveys the main idea, and make this idea explicit: “The main idea of this paper is …”, “In this section we present the main contributions of this paper.” Many papers have great ideas but they didn’t “distill what they are”.

• Of course, you need to convey a single idea, but don’t feel pressured to present only a fantastic idea. Rather, you should “write a paper, and give a talk, about any idea, no matter how insignificant it may seem to you”.

Your readers should be able to sum up your paper with one single sentence: This paper finds that…

1. The narrative flow: Here is a problem -> It’s an interesting but unsolved problem -> Here is my idea (to solve it) -> My idea works (with details and data) -> Here is how my idea compares to others' approaches.
• Abstract (4 sentences)
• Introduction (1 page)
• The problem (1 page)
• My idea (2 pages)
• The details (5 pages)
• Related work (1-2 pages)
• Conclusions and future work (0.5 page)
1. The introduction does two things in one page only: (1) “Describe the problem”, and (2) “State your contributions”. When introducing the problem, use an example! When stating your contributions, try using bullet points to make the structure clearer.
• Do not write this in your introduction: “The rest of this paper is structured as follows. Section 2….” Instead, your introduction is supposed to survey the whole paper.
1. Between Introduction and The problem, don’t insert Related work (Literature review). Instead, put it between The details and Conclusions and future work.

2. When presenting your idea, use simple examples that even a novice can understand.

• When your draft is done, you can send it to the author of the paper on which your research is based. Ask them “could you help me ensure that I describe your work fairly?”. Since they are interested in this, they most often will reply.

• Be grateful for every single word your reviewers put in the review. Treat criticism as positive suggestions that can help make your work clearer. Even if the criticism is wrong and it turns out the reviewer did not spend time and effort on understanding your (completed) work, try simplifying it!

• Follow instructions by the organizer closely
• Spell check
• Use strong visual structure in your paper: sections and subsections, bullet points
• Use nice pictures/diagrams
• Use simple words and phrases

## 2021-05-29 #

Peyton Jones, S. (2014). How to write a great research paper . Video of this presentation .

PP. 1-27

## 2021-05-28 (Completed on 2021-05-29) #

👍👍 Eisner, J. (2010). Write the Paper First .

The key idea of this post is that if you are planning to submit a paper to a conference, you need to write your paper right now! It’s totally okay, and even expected, if you haven’t run the experiment and don’t have any tangible results. You can put empty tables and figures in the Results section. The main goal of the experiments is to allow you to put numbers in those empty tables/figures. Remember, you should never postpone writing until your experiment is finished.

### Why? #

1. Clear writing triumphs over your experimental results in terms of (1) odds of acceptance and (2) likelihood of being cited.

2. If you complete the writing before you do the experiment, you can send it to your advisor or colleagues who can give you suggestions. If they think the idea you want to convey sucks, you don’t need to waste your time actually doing the experiment. And your writing isn’t a waste of time since you may find the idea applicable elsewhere.

### What should you do? #

Write the Introduction, where you need to present a big picture and sell your idea. Then, the lit review. Don’t do a lit review before writing your own ideas. This is because after writing your own ideas, you can see them from a different perspective when writing the lit review. Then write the methods/experimental and even the results section. (The author didn’t mention the results section. I think it is possible, though.)

Every paper needs at least one beautiful picture. But don’t spend time refining it now. Simply put in your sketch first.

Remember to document your code early on. Don’t wait until your project is finished. Don’t fool yourself.

## 2021-05-27 #

1. Finished Li et al. (2021)

2. Tsai, J. L. (2021). Why does passion matter more in individualistic cultures? . PNAS, 118(14).

People in Western individualistic societies value passion and excitement, whereas people in East Asian collectivistic societies value calm and balance. This is why you see so many faces with big toothy smiles in Western societies but see them less frequently in East Asian countries and regions.

## 2021-05-26 #

Li, X., Han, M., Cohen, G. L., & Markus, H. R. (2021). Passion matters but not equally everywhere: Predicting achievement from interest, enjoyment, and efficacy in 59 societies. PNAS, 118(11).

The study finds that among over 1 million 15-year-old students around the globe, passion predicts academic achievement, i.e., scores in science (2015), math (2012), and reading (2009), more strongly in individualistic societies than in collectivistic ones. The result remains the same after controlling for GDP per capita. Cultural differences other than individualism-collectivism are not robust predictors of the relationship between passion and academic achievement. Note that it doesn’t mean passion is unimportant in collectivistic societies; passion is still a significant predictor of academic achievement in these societies.

Support from parents predicted academic achievements more strongly in collectivistic societies than in individualistic ones.

PP. 1-8

## 2021-05-25 #

Potter, L., Kalubi, D., & Schönenberger, K. (2021). Opinion: Academic-humanitarian technology partnerships: an unhappy marriage? . PNAS, 118(11).

The authors of this opinion piece argue that partnerships between humanitarian and development organizations (HDOs) such as ICRC, UNICEF, and WHO, and academics may fail.

1. Funding and human resources may be limited.
• HDOs think that universities may have their own funding for experimental studies. However, truly experimental work is not sufficient to produce products that satisfy HDOs' needs as many products are expected to be working in very harsh conditions.
• Researchers in universities may see HDOs as an external funding source. However, HDO funds need detailed reporting to donors. Since research needs lots of money and may not produce any tangible results at all, it’s very difficult for HDO to spend many financial resources on R&D.
• HDO may face a lack of human resources committed to the partnerships. There is lots of paperwork to do and HDO staff already have their tasks. They may not have time to write project plans and funding reports.
1. Deployment and sustainability
• Products, for example, medical equipment, need training, maintenance, and upgrade services. However, the company that is tasked with producing the equipment has little interest in these. Training and maintenance need money, which normally comes from selling loads of these products. However, HDOs only need a limited number of them. This won’t create profits for the company.

• Once the partnership produces a publication, researchers are no longer interested in it. They have to move on to other projects that produce papers.

1. Roles, responsibilities and expectations
• When HDOs find an interesting publication, they may invest thousands of dollars to implement it. However, there might be a huge gap between academic publication and implementation. HDOs may waste lots of money.

• HDOs have a longstanding distrust in commercial sectors and believe academia is neutral. This is wrong. Universities nowadays are very commercialized. HDOs should update their view on companies and universities.

## 2021-05-24 #

1. Finished Nielsen & Andersen (2021)

2. Nichols, J. D., Oli, M. K., Kendall, W. L., & Boomer, G. S. (2021). Opinion: A better approach for dealing with reproducibility and replicability in science . PNAS, 118(7).

The authors of this opinion argued that a better solution to the replicability crisis in science is to shift the focus from replicating single studies to designing and conducting progressive sequences of studies that accumulate evidence on a certain topic. Researchers should be encouraged to participate in these progressive sequences of studies. Administrations of universities should shift the reward system from emphasizing single studies to these sequences of studies. Funding agencies such as NSF should do the same.

## 2021-05-23 #

Nielsen, M. W., & Andersen, J. P. (2021). Global citation inequality is on the rise . PNAS, 118(7).

• In the fields of health sciences, agricultural sciences, and natural sciences, the citation share of the top 1% of scientists increased from 14.7% in 2000 to 19.6% in 2015. The publication share of the top 1% elite increased from 5% to 12% (fractionalized output) during the same period.

• The citation elite increased its share of publications (in fractionalized output) and citations, but its productivity in fractional counts and its impact per paper decreased.

• The citation share of the top 1% of scientists decreased by 7% in the field of computer and information sciences. That’s the largest decrease in all fields.

• The citation elite is increasingly found in Western Europe and Australasia. The citation share of the top 1% of scientists decreased in the US, China, and Japan. It decreased in renowned universities in the US but increased in renowned institutions in Western Europe and the UK.
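To see what a “citation share of the top 1%” means in practice, here is a small sketch. It is a simplification I wrote for myself: it ranks scientists by raw citation counts, whereas the paper's own definition of the elite involves more careful counting. The function name `top_share` and the data are made up.

```python
import math


def top_share(citations, top_fraction=0.01):
    """Share of all citations held by the top `top_fraction` of scientists."""
    if not citations:
        raise ValueError("citations must not be empty")
    ranked = sorted(citations, reverse=True)
    # Always include at least one scientist in the top group.
    k = max(1, math.ceil(len(ranked) * top_fraction))
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:k]) / total


# Hypothetical field: 100 scientists, one of whom is cited far more often.
citations = [500] + [5] * 99

share = top_share(citations)
print(f"Top 1% citation share: {share:.1%}")
```

Computing this share year by year for the same field would give the kind of trend the bullet points above describe.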

PP. 1-6

## 2021-05-22 #

1. Scheufele, D. A., Hoffman, A. J., Neeley, L., & Reid, C. M. (2021). Misinformation about science in the public sphere . PNAS, 118(15).

Five themes in the colloquium issue:

1. What we do not know about misinformation. For example, how frequent is it? How successful are the coping strategies?

2. What’s wrong with the science community itself? Why are we producing misinformation even as scientists?

3. How to intervene? And how effective are these interventions? For example, are short-term corrections working in the long run? Why or why not?

4. Think of the issue of misinformation and possible solutions in a new light. For example, storytelling in science publications?

5. What does it mean to be scientifically literate?

6. Clauset, A. (2021). Prediction and its limits for scientific discovery .

## 2021-05-21 #

Nanayakkara, P., & Hullman, J. (2020). Toward Better Communication of Uncertainty in Science Journalism . Computation + Journalism, March 2020, Boston, MA, USA

Stray, J., & Hao, K. Interactive Visualization of Fairness Tradeoffs . Computation + Journalism, March 2020, Boston, MA, USA

## 2021-05-20 #

Finished Borkin et al. (2013)

The authors web-scraped close to 5,700 visualizations, among which there are 2,070 single visualizations. They then annotated these visualizations based on their taxonomy for static visualizations (See 3 Visualization Taxonomy). They also tagged the source of each visualization, i.e., infographics, scientific publication, news, and government/world organizations.

They selected a subset of 410 visualizations and presented them to participants on Amazon MTurk. Each visualization is treated as an image and is presented for only 1 second. Participants press the key if they see a visualization for the second time. Therefore, this study looks at which types of visualizations as images are more memorable.

### Results #

• If a visualization is memorable, it is memorable for most people. That is to say, “[T]he memorability of a visualization is a consistent measure across participants.”

• The following attributes of visualizations contribute to memorability: pictograms, color, chart junk, and high visual density.

• Unique types of visualizations, such as diagrams, networks, and trees are more memorable than common types such as bar, line, and pie charts.

• Infographic visualizations are the most memorable, followed by visualizations from scientific publications, news, and government. Scientific visualizations are memorable possibly because they contain more diagrams.

### Further thoughts #

1. How about interactive visualizations? What characteristics make interactive visualizations more memorable?

2. How about testing memorability a week after the experiment as well?

## 2021-05-19 #

Borkin et al. (2013)

PP. 1-7

## 2021-05-18 #

1. Finished Borkin et al. (2015)

This work is exciting.

The authors used the same dataset as in What makes a visualization memorable?. 393 single visualizations are selected, among which 100 are randomly chosen to be shown to participants in the Encoding Phase. Participants spent 10 seconds on each visualization in this Phase. Then, in the Recognition Phase, participants are shown 200 visualizations (100 were shown before, the other 100 are filler visualizations). Each visualization is shown for 2 seconds. Participants press keys to indicate whether they have seen each visualization in the previous Phase. Lastly, in the Recall Phase, each participant is shown the visualizations they managed to recognize in the Recognition Phase. These visualizations are artificially blurred by the authors. Participants are asked to input any information they remember about each visualization in an empty box right next to the visualization.

Here is what the authors find:

• If a visualization is memorable “at a glance” (where people see it for only 1 second), it is also more likely to be memorable if people see it for 10 seconds.

• “Human recognizable objects” help make visualizations recognizable.

• Visualizations that are the most recognizable after 1 second exposure are likely to get descriptions of the highest quality after 10-second exposure.

• A good title helps people recall the visualization.

• Pictograms do not hinder recall.

• Data and message redundancy help people recognize and recall the visualization.

1. Borkin, M. A., Vo, A. A., Bylinskii, Z., Isola, P., Sunkavalli, S., Oliva, A., & Pfister, H. (2013). What makes a visualization memorable?. IEEE transactions on visualization and computer graphics, 19(12), 2306-2315.

PP. 1

## 2021-05-17 #

Borkin et al. (2015)

PP. 2-7

## 2021-05-16 #

This piece argues that as scientists, we should inform rather than persuade the audience. Science is not about shaping the public’s “emotions and beliefs”.

• Key to trustworthiness: expertise, honesty, and good intentions.

• Show uncertainties. Let the audience know if you are unsure of your results.

• Preemptively debunk possible misunderstandings (i.e., “prebunking”)

Always aiming to ‘sell the science’ doesn’t help the scientific process or the scientific community in the long run, just as it doesn’t help people (patients, the public or policymakers) to make informed decisions in the short term.

1. Borkin, M. A., Bylinskii, Z., Kim, N. W., Bainbridge, C. M., Yeh, C. S., Borkin, D., … & Oliva, A. (2015). Beyond memorability: Visualization recognition and recall . IEEE Transactions on Visualization and Computer Graphics, 22(1), 519-528.

PP. 1-2

## 2021-05-15 #

PP. 1-2

1. Finished How to Be a Successful PhD Student by Mark Dredze and Hanna M. Wallach (2012).

• The fact that you like a professor’s research doesn’t necessarily mean that you will be comfortable working together.

• It’s better to meet your advisor regularly, for example, weekly. Remember to make an agenda for each meeting. Bring your results to the meeting. Start each meeting with a summary of the points from the last meeting.

### Be productive #

• To be productive, creative, and independent.

[T]he purpose of graduate school is research, not taking classes. Although taking classes is part of graduate school, when it comes to success, it’s all about research. Do well enough in your classes but focus on publishing high quality research papers.

• “Talk to other students”.

• It’s better to work in the lab for at least 20 hours each week. If you cannot focus on your studies/research in the lab, talk to your advisor and find a solution.

• Prioritize.

• If you aren’t healthy or happy, you won’t be productive in research.

### Research #

• Take notes of every paper you read. Some need detailed notes, whereas others need simple notes.

• When it comes to important papers in your field, read deeply. Understand them thoroughly.

#### Research ideas #

• When choosing a research topic, you need to know the literature, and the “online dialogue” within your field.

Try to focus on big problems rather than making incremental improvements to previous work.

You will be judged on what you publish, not how long it took you to come up with the idea.

• Only work on solutions to problems that do exist.

#### Publishing #

• Organize your folders, codes, texts, etc. IMMEDIATELY after a paper submission. If you don’t do it then, you won’t do it later either.

### Talks #

• Practice. Practice a lot for a talk.

• Seek feedback/suggestions from friends/colleagues.

• Focus on content, not form.

• Pay attention to Timing.

### Professional development #

1. Do internships, either early on or later when you have special skill sets. It doesn’t hurt to have more than one internship.

2. Start reviewing papers in your field early on.

### Progressing in PhD #

• Year 1 - Year 2/3: Take classes, find an advisor, read tons of papers, observe how others do research, finish a project with results (not necessarily a publication).

• Year 3 - Year 4/5: Choose a research area, and publish papers in that area.

• Year 5/6: Write your thesis

### Networking #

Making tutorials will help people know your name. Attend conferences frequently and introduce yourself.

### Jobs #

Know what makes you happy (industry, government, starting a company, teaching, research, etc.) and do it after your PhD.

## 2021-05-14 #

How to Be a Successful PhD Student by Mark Dredze and Hanna M. Wallach (2012).

PP. 3-11

## 2021-05-13 #

### Timing #

In the Summer a year before you graduate, complete your CV, research statement, and teaching statement. Start sending your applications to schools in the Fall a year before you graduate. You can apply through February. You probably will get interview invitations from February to early May. You’ll need 3-6 recommenders, so let them know in advance.

### Readings that might help #

• Even a Geek Can Speak
• Tomorrow’s Professor
• A PhD Is Not Enough: A Guide To Survival In Science

### Application materials #

• Have a homepage. Attach a profile photo on it. Put your job materials in HTML on your site in addition to PDF.

• Take time to tailor your cover letter to specific schools.

• In the cover letter, boldface the names of faculty members. Keep the cover letter brief, around 1-2 pages.

### Who’s hiring #

• Go to a conference in your field a year before you graduate and ask people whether they are hiring.

• Mention in your homepage that you are graduating and looking for a job.

### Job talk #

• Practice 10 times before your first job talk. Practice twice for subsequent talks.

### Miscellaneous #

• The next day, email a personalized message to everybody you talked to.

1. How to Be a Successful PhD Student by Mark Dredze and Hanna M. Wallach (2012).

PP. 1-3

## 2021-05-12 #

1. A thesis proposal is a contract by Matt Might

• Make the thesis statement as short as possible. It asks the question of “What will humans learn as a result of this dissertation?”

• A thesis proposal is a contract. Try to make the contract as specific as possible. Remember to put in your proposal some milestones (e.g., completing literature review, complete surveys, submit for publication, etc.), dates and the completion criteria of these milestones. “The proposal needs to create the impression that failure is unlikely”.

Good proposals give the impression that between one-third and two-thirds of the work remains to be completed.

• A proposal should be around 15 pages, not longer. Limit the presentation time to 30-45 minutes.

PP. 1-5

## 2021-05-11 #

Finished Chun (2017)

Overview: The study finds that a redundant encoding (here, value represented with shades of gray) neither helps nor harms (1) the accuracy or (2) the speed of judgement of graphs encoded with single primary encodings (position, angle, area, length).

### Study aims #

The study examines whether a redundant encoding, in this case, values represented by shades of gray, will boost or harm accuracy and speed of understanding graphs encoded with various primary encodings (position, length, angle, and area).

The study also examines whether the additive benefit of value, if any, to accurate encodings (e.g., position) will be greater than to less accurate encodings (e.g., area).

### Results #

1. Position is the most accurate encoding, followed by angle, area, and length.
2. The redundant encodings do not have any effects on the understanding of graphs encoded with the four primary encodings. The hierarchy of accuracy with redundant encodings is the same as that with single primary encodings.
3. Participants completed tasks on graphs with redundant encodings in the same amount of time as participants completing tasks on graphs with single primary encodings.

### Discussions #

1. Readers might have filtered out the redundant encoding, i.e., values represented with shades of gray, and focused on primary encodings of the graph.
2. An open question: If we pair visual variables that have similar accuracy properties, for example, angle & area, will we see a boost or drop in the accuracy of understanding and in completion time?

## 2021-05-10 #

Chun, R. (2017). Redundant Encoding in Data Visualizations: Assessing Perceptual Accuracy and Speed . Visual Communication Quarterly, 24(3), 135-148.

PP. 1-11.

## 2021-05-09 #

1. How to send and reply to email by Matt Might

• Keep the Subject as informative and actionable as possible.

• Don’t use very long paragraphs. Try to break them into points.

• When replying to an email, reply to its points rather than the whole of it.

1. What every computer science major should know by Matt Might

I don’t think the tips in this post are “realistic”. For undergraduate students, it’s good if you want to learn every subfield within computer science. But for PhD students, this might not be good.

• Learn statistics, linear algebra, and calculus.

• Try Linux.

• Take a compiler class.

• Learn to program in C and JavaScript.

## 2021-05-08 #

You will also need to actively, even aggressively, forge relationships with scholars in your field. Researchers in your field need to know who you are and what you’re doing. They need to be interested in what you’re doing too.

[Y]ou have to spoon-feed the experts. As you write, you have to consciously minimize the amount of time and cognitive pain it takes for them to realize you’ve made a discovery.

[T]he only way to get better at writing is to do a lot of it.

That’s why I recommend that new students start a blog. Even if no one else reads it, start one. You don’t even have to write about your research. Practicing the art of writing is all that matters.

1. 6 blog tips for busy academics By Matt Might

## 2021-05-07 #

1. How to get a great letter of recommendation by Matt Might

2. Tips for work-life balance by Matt Might

3. Finished Gilbert (2020)

## Papers #

[T]he process of discovering what makes a great paper great is itself an invaluable learning experience.

For a 1st or 2nd year student, that plan might be delivering individual sections on a schedule for in-depth feedback, followed by rewrites by the students, followed by extensive editing and rewriting by me.

[E]verything needs to be essentially done by 7 days before the deadline: the studies have to be completed, the data analyzed, the findings solidified, the message of the paper needs to exist, etc.

The paper doesn’t have to be perfect: the bar I use is that I would submit this paper and I wouldn’t be embarrassed.

If the paper isn’t ready, we won’t submit it to that conference or journal deadline; we’ll submit it somewhere later.

PP. 9-18

### Classes #

Many students, having excelled in classes their entire academic lives, have trouble letting go of this, and need to excel in every class they take while in the PhD program. In my opinion, this is a common reason for burnout in grad school. When you go on the job market, almost no one will care what classes you took during grad school and how well you performed in them. Everyone will care about your research and what contributions you made.

The PhD program is all about learning new things, but as your research problems lead you there.

You might seek out a class to help you master something you know you need in research. But let the research drive that exploration and those decisions.

## Talks #

• Plan and practice your talks.

• It’s okay to (1) repeat the questioner’s question and ask whether it’s what they asked; and (2) wait for a second or two before you answer that question.

## Life #

Please take care of yourself, in whatever way you need … For example, I need to sleep, exercise, and spend time with my family. I prioritize those things every day.

Setting everyday boundaries can be very difficult in academia, but I think it has really helped me over the long term.

## 2021-05-06 #

Continued with Gilbert (2020)

PP. 9-18

## 2021-05-05 #

1. Gilbert, E. (2020). Syllabus for Eric’s PhD students

PP. 1-9

1. Finished Jabr (2019)

It’s easy to lose sight of the fact that even something that seems minor, like a filling, involves removal of a human body part. It just adds to the whole idea that you go to a physician feeling bad and you walk out feeling better, but you go to a dentist feeling good and you walk out feeling bad.

I totally feel the physical and psychological pain of those who underwent Lund’s procedure. I had trouble with no more than two teeth and I felt it was unbearable. What Lund did was truly disgusting!

## 2021-05-04 #

Jabr, F. (2019). The Truth About Dentistry . The Atlantic.

PP. 1-14

## 2021-05-03 #

Daniel Engber (2016). Is “Grit” Really the Key to Success?

Key points in this essay:

• Duckworth promises in her book more than what “grit” can deliver.

• Grit is nothing more than other personality traits, such as industriousness.

• Grit matters but only in certain situations in life, not all of them. There is a “restriction of range”. Some papers concluded that grit does not predict academic success, in New England or in Austria.

• Grit makes some differences in life, but certainly not “all the difference in the world”.

• Grit is perhaps just a rebranding, rather than producing something new.

• It might be impossible to change your personality.

• Being high in grit might be harmful. For example, you might hurt yourself physically if you are too focused on a goal.

Some quotes in this essay:

And like other self-help authors, she pulls a sleight of hand by which even widely held assumptions end up looking like discoveries. It’s as important to work hard, the book contends, as it is to be a natural talent. Who would disagree with that?

For most people, life may be less like a marathon than a series of sprints, interspersed with periods of rest and hours upon hours spent browsing the internet.

It’s one thing to argue that grit matters more than talent or, more accurately, that your personality helps determine your success. Duckworth goes much further, asserting that you can change your personality and learn to “grow your grit.”

It could be that having too much strength of purpose is worse than having not enough.

## 2021-05-02 (completed on 2021-05-03) #

Jonah Lehrer (2011). Which Traits Predict Success? (The Importance of Grit)

Taken together, these studies suggest that our most important talent is having a talent for working hard, for practicing even when practice isn’t fun.

## 2021-05-01 #

Finished Hoffman et al. (2016)

• Americans without medical training on average endorse 22.43% of the false beliefs of biological differences between White and Black people. 73% of the participants endorse at least one false belief. For Americans with medical training, these two figures are 11.55% and 50%, respectively. This is surprisingly high, given that they have received formal medical training.

• Participants in Study 2 (i.e., those who received medical training) who endorse more false beliefs report that a Black person feels less pain. These participants are also less accurate in their treatment recommendations for black targets.

• Participants in Study 2 who do not endorse, or endorse fewer, false beliefs, report that a White person feels less pain. However, these participants are not biased in their treatment recommendations for White people.

# 2021-04 #

## 2021-04-30 #

1. Hoffman, K. M., Trawalter, S., Axt, J. R., & Oliver, M. N. (2016). Racial bias in pain assessment and treatment recommendations, and false beliefs about biological differences between blacks and whites . PNAS, 113(16), 4296-4301.

PP. 1-2

1. Finished Edwards et al. (2019)

• Fig. 1 shows that African American men are much more likely to be killed by police than men of other ethnicities. Women are much less likely (20 times lower, to be exact) to be killed by police use of force than men are.

• Fig. 2 shows that Black men are 2.5 times more likely to be killed by police than are white men. This number is 1.2-1.7 for American Indian/Alaska Native men, 1.3-1.4 for Latino men, and 0.5 for Asian/Pacific Islander men.

• 1 in 1000 black men and boys will be killed by police over their lifetime. This is a nontrivial death risk.

• 1.6% of deaths involving black men between 20 and 24 years old are caused by police use of force.

## 2021-04-29 (Completed on 2021-04-30) #

1. Tessum, C. W., Apte, J. S., Goodkind, A. L., Muller, N. Z., Mullins, K. A., Paolella, D. A., … & Hill, J. D. (2019). Inequity in consumption of goods and services adds to racial–ethnic disparities in air pollution exposure . PNAS, 116(13), 6001-6006.

I skimmed through this paper but did not go into details. The conclusion is that non-Hispanic white people caused more PM2.5 than they experienced.

1. Edwards, F., Lee, H., & Esposito, M. (2019). Risk of being killed by police use of force in the United States by age, race–ethnicity, and sex . PNAS, 116(34), 16793-16798.

PP. 1-3

## 2021-04-28 #

Kowal, M., Groyecka-Bernard, A., Kochan-Wójcik, M., & Sorokowski, P. (2021). When and how does the number of children affect marital satisfaction? An international survey . Plos one, 16(4), e0249516.

• Survey takers (7178 married individuals from 33 countries/regions) expressed significantly lower marital satisfaction the more children they had. This negative association is significant only among women. Figure 3 showed that men are more satisfied in marriage than women.

• Fig. 2 showed that when the number of children is fewer than two, then the more education people receive, the more satisfied they are with their marriage. The results are the opposite when the number of children exceeds two. Caution: I do not know whether these differences are significant.

• Wealth is not a significant moderator: rich or poor, you’ll be less satisfied with the marriage if you have more children.

## 2021-04-27 #

Finished Wu et al. (2017)

I did not get the main points in this paper.

## 2021-04-26 #

1. Wu, Y., Xu, L., Chang, R., & Wu, E. (2017). Towards a bayesian model of data visualization cognition . In IEEE Visualization Workshop on Dealing with Cognitive Biases in Visualisations (DECISIVe).

PP. 1

1. Finished West & Bergstrom (2021) .

I believe this work is a summary of this book/course by the same authors.

• Replication crisis, especially in social and biomedical sciences
• Preregistration may not be a panacea as it discourages exploratory analyses.
• Quotation errors: 1/10 to 1/5 of scientists interpreted others' work inaccurately when citing it.
• Many retracted papers are still cited after retraction.
• Predatory publishers are a real problem. Chances are authors are aware that the publishers are not legitimate but they go ahead anyway due to pressure to publish a lot. Also, if malicious people publish misleading work intentionally, the public will be misinformed and gradually have a lower trust in science overall.
• arXiv papers that were later published in prestigious venues attracted more citations than those that found their place in less well-known journals. This indicates journals are a good gatekeeper. However, there is another side of the story: highly cited arXiv papers are less likely to be accepted for publication somewhere else at all!
• There might be a “Google Scholar bubble”: less diversity in search results; Matthew Effect
• Data, when processed, analyzed, or presented in a wrong way, can misinform.

## 2021-04-25 #

Continued with West & Bergstrom (2021).

PP. 3-6

## 2021-04-24 #

West, J. D., & Bergstrom, C. T. (2021). Misinformation in and about science . PNAS, 118(15).

PP. 1-3

## 2021-04-23 (Completed on 2021-04-24) #

Tilghman, S., Alberts, B., Colón-Ramos, D., Dzirasa, K., Kimble, J., & Varmus, H. (2021). Concrete steps to diversify the scientific workforce . Science, 372(6538), 133-135.

• By 2045, no single ethnic group will be a majority in the US.

• How can programs encourage minority groups to be engaged in science: Create communities using cohorts; offer personalized mentoring; provide financial support.

• How to diversify the scientific workforce: a major federal initiative; educational institutions forming a culture that welcomes minorities into scientific research; more inclusive funding policies for minority scientists.

## 2021-04-22 #

### Finished Kim et al. (2017)

Aim: To investigate how other people’s expectations of the data influence people’s reactions to a visualization: ability to recall, and the trust in, the data.

Hypotheses:

• H1: Social information leads to better recall of the data shown.

• H2: Congruence in social signals leads to higher trust, and higher likelihood of updating their beliefs.

• H3a: Congruent-HighConsensus leads to higher trust and higher likelihood to update beliefs; Incongruent-HighConsensus leads to the opposite results.

• H3b: HighConsensus leads to better recall of the data.

Results:

• H1 ❌: The presence of social information has no impact on data recall.

• H2 ✔️: Congruence improves trust, and the likelihood to update beliefs.

• H3a ❌: The level of consensus has no impact on trust or on the likelihood of belief update in either the Congruent or the Incongruent group.

• H3b ✔️: HighConsensus leads to improved recall.

Other findings (which do not seem preregistered):

1. Social information leads to improved recall when there is high consensus in the social signal.

2. When participants had lower trust in the data even before seeing it, they are more likely to be swayed by the social information they see: lower trust in the data when others disagree with the data, and higher trust when others agree.

### Surprise, surprise #

Munnich, E., Ranney, M. A., & Song, M. (2007). Surprise, surprise: The role of surprising numerical feedback in belief change . In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 29, No. 29).

The more surprised you are by numerical feedback, the more accurately you’ll recall the number later.

## 2021-04-21 #

Continued with Kim et al. (2017)

PP. 7-8

## 2021-04-20 (Completed on 2021-04-21) #

Continued with Kim et al. (2017)

PP. 4-7

## 2021-04-19 #

Kim, Y. S., Reinecke, K., & Hullman, J. (2017). Data through others' eyes: The impact of visualizing others' expectations on visualization interpretation . IEEE vis, 24(1), 760-769.

PP. 1-4

## 2021-04-18 #

Finished Pandey et al. (2014)

This study finds that when participants hold a neutral opinion on a topic or are weakly polarized, charts lead more people to change their views compared to tables (that communicate the same data as in charts). This pattern also exists among positively polarized participants.

A reverse pattern is observed among negatively polarized participants: more people change their opinion when presented with tables. It should be pointed out that this difference is statistically significant for only one of the three topics, although when aggregating the data from all the three topics, the difference is statistically significant.

## 2021-04-17 #

Pandey, A. V., Manivannan, A., Nov, O., Satterthwaite, M., & Bertini, E. (2014). The persuasive power of data visualization . IEEE transactions on visualization and computer graphics, 20(12), 2211-2220.

PP. 1-5

## 2021-04-16 #

Finished Watts et al. (2021)

Lots of publications on misinformation are based on data from social media or websites. However, a nationally representative survey finds that (1) news is a small fraction of media consumption; (2) online news is a small fraction of news consumption; and (3) fake news is only a tiny portion of the information Americans receive.

The authors proposed a comprehensive research agenda that aims to achieve four objectives:

1. To build a large-scale data infrastructure to study the production (an updated catalog of relevant information), consumption and distribution (to measure people’s news consumption continuously across all platforms: social media, desktop, email, instant messaging, etc.), absorption and understanding (whether news consumption changes people’s beliefs), action and engagement (to see whether information people receive translates into action. For example, to protest, to volunteer, or simply to follow/retweet/like/comment on social media).

2. To build a “Mass Collaboration” Model where people can replicate and accumulate knowledge.

3. To communicate insights obtained from research to nonacademic people. One possible approach is to create data visualizations that constantly update new data.

4. To build an academic-industry collaboration.

## 2021-04-15 #

Continued Watts et al. (2021)

PP. 2-6

## 2021-04-14 #

1. I skimmed through the rest of Smith et al. (2020).

2. Watts, D. J., Rothschild, D. M., & Mobius, M. (2021). Measuring the news and its impact on democracy . PNAS, 118(15).

PP. 1-2

## 2021-04-13 #

1. Finished Olgado et al. (2020)

This paper, utilizing document theory, reveals how dating apps make money. The authors conclude:

…(D)ating profiles are not people but rather documents with a “casting mold” designed for profit extraction, perpetuating structural inequalities beyond intimate discrimination which incremental design changes inadequately addresses.

1. Smith, C. E., Yu, B., Srivastava, A., Halfaker, A., Terveen, L., & Zhu, H. (2020, April). Keeping Community in the Loop: Understanding Wikipedia Stakeholder Values for Machine Learning-Based Systems . In 2020 CHI (pp. 1-14).

PP. 1-4

## 2021-04-12 (Completed on 2021-04-13) #

Olgado, B. S., Pei, L., & Crooks, R. (2020, April). Determining the Extractive Casting Mold of Intimate Platforms through Document Theory . In 2020 CHI (pp. 1-10).

PP. 1-5

## 2021-04-11 #

Finished Jacobs et al. (2014)

One thing I can remember after reading this paper is that technology literacy sometimes may inhibit, rather than help, the adoption of a digital gadget. People who work in front of a screen all day long probably won’t want to face another screen at home.

## 2021-04-10 #

Jacobs, M. L., Clawson, J., & Mynatt, E. D. (2014, April). My journey compass: a preliminary investigation of a mobile tool for cancer patients . In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 663-672).

PP. 1-6

## 2021-04-09 #

Finished Weng et al. (2012)

I didn’t totally understand the details of this paper but I got the idea. The combination of (i) social media structure and (ii) limited user attention is sufficient to account for the patterns of meme popularity, meme lifetime, and user activity. The authors did not prove that the values of memes are not a factor in their popularity. However, they showed that it’s possible to predict the pattern of memes on Twitter without considering the intrinsic values of memes.

Fig. 2 shows that a user’s “breadth of attention” is constant despite the growing number of posts/memes in social media.

Fig. 3 shows that people are more likely to retweet something related to what they posted in the past. This means that memory is an important factor in meme competition.

## 2021-04-08 #

1. Finished Park et al. (2019)

2. Weng, L., Flammini, A., Vespignani, A., & Menczer, F. (2012). Competition among memes in a world with limited attention . Scientific reports, 2(1), 1-9.

PP. 1-5

## 2021-04-07 #

Park, J., Wood, I. B., Jing, E., Nematzadeh, A., Ghosh, S., Conover, M. D., & Ahn, Y. Y. (2019). Global labor flow network reveals the hierarchical organization and dynamics of geo-industrial clusters . Nature communications, 10(1), 1-10.

PP. 1-7

## 2021-04-06 #

Bjerre-Nielsen, A., Kassarnig, V., Lassen, D. D., & Lehmann, S. (2021). Task-specific information outperforms surveillance-style big data in predictive analytics . PNAS, 118(14).

Task-specific data based on historical outcomes has a better predictive ability than big data, which usually comes with the cost of privacy.

## 2021-04-05 #

Finished Guess et al. (2021)

Those who set their desktop browser homepage to Huffington Post visited the Huffington Post website more. Those who set the homepage to Fox News increased their visits to the Fox News website. The treatment had little impact on participants' political opinions, but decreased participants' trust in mainstream media for up to one year after the experiment.

## 2021-04-04 #

… [B]igger tech companies are usually much better at hiding what they actually do with user data, while restricting users from having control and oversight over their own data. Once you give, there is no taking back.

1. Guess, A. M., Barberá, P., Munzert, S., & Yang, J. (2021). The consequences of online partisan media . PNAS, 118(14).

PP. 1-3

## 2021-04-03 (Completed on 2021-04-04) #

Finished Banet-Weiser, S., & Portwood-Stacer, L. (2006)

## 2021-04-02 #

Continued with Banet-Weiser, S., & Portwood-Stacer, L. (2006)

PP. 1-8

## 2021-04-01 #

Banet-Weiser, S., & Portwood-Stacer, L. (2006). ‘I just want to be me again!’ Beauty pageants, reality television and post-feminism . Feminist Theory, 7(2), 255-272.

PP. 1

# 2021-03 #

## 2021-03-31 (Completed on 2021-04-01) #

Finished Guo, 2021. A mid-timer’s thoughts on publishing academic papers

### “A conversation with the past in your field” #

A research paper is “a conversation with the past in your field”. “with the past” means you need to show the connection of your present paper to the progress made in the past; “in your field” means that you need to show how your work contributes to the development in your specific academic field.

Remember that a research paper is not a lab report, not a technical blog post, not a how-to guide to software, not a news article, not a popular science article for the non-academic folks. To be accepted, your work has to show its connection to the past and its contributions to the development within your chosen specific field.

### “Calibrating yourself to the norms of your field” #

You need to read papers in your own field published in the past five years to get an idea of how other people in your field write papers. Inventing a new way to write research papers without invitation “is not stylish, it is just rude” (Lukeš, 2021).

To calibrate yourself may take years. Working with your advisor can speed up this process. By observing how your advisor writes the paper, you can absorb the writing norms of your field much faster.

The first advice you need to give to an academic writer is not to read a book on stylish writing but rather to read how people in their field are writing.

### Writing consistently and regularly #

Don’t expect a perfect time to write. It doesn’t exist. When you write, first focus on the number of words. That’s the easiest thing. Don’t worry about the quality in the beginning. Just get the draft done and then you can improve it. If you expect your first few sentences to be perfect, you’ll never start writing.

### Pay close attention to the introduction #

Always ask yourself, “so what?” You’ve done this study, so what? What problems does it solve? Why should it be published anyway? Who cares?

The introduction should contain the soul of your paper. Keep tweaking it until the submission deadline. Many papers lack a soul because the writers don’t spend enough time thinking/writing about the motivation and the value of their work. They submitted anyway. That’s too bad.

### Title, metadata, and abstract #

Craft the title and abstract, and choose the right boxes to check (when submitting) in a way that can help your paper to be in the hands of the right people: scholars who have expertise in what you do. They want to see their academic lineage “carry on through the papers of newcomers like yourself”. By contrast, those outside of your specific (sub)field do not appreciate your work and don’t have incentives to have it published.

### Quantity and quality #

Quantity and quality aren’t that different. Those who publish good work are also those who publish a lot of work. Therefore, don’t procrastinate. Come up with ideas, do research, write, submit, revise, and publish. Just write! Writing will never be easy. You have to go through those miseries to get things done.

### Acceptance and awards are random, but not that random #

Don’t submit papers you know aren’t good enough. Don’t waste reviewers' time. However, keep in mind that even the best work can be rejected. There is much randomness involved.

That said, good papers always have higher chances of acceptance and winning awards.

## 2021-03-30 #

1. Reviewed Guo (2013)

### 1. Research statement #

Research statement: 4 pages, single-spaced, 12-point, 1.25-inch margin; The first 1.5 pages to describe your previous work, the next 1.5 pages to show your future research ideas, and the last page for a bibliography.

### 2. Scheduling interviews #

• Don’t let your top choice be your first interview, as you’ll most likely screw up the first one.

• Never schedule more than one interview per week. You need time to relax.

• To avoid delays, book direct flights.

• When travelling on the plane, wear your interview shoes and coat, in case your luggage gets lost.

### 3. Preparing for interviews #

• For your job talk, practice, practice, practice. You need to practice a lot! You can practice in front of your colleagues and friends, who can then give you suggestions for improvement.

• For one-on-one interviews, prepare a document of talking points. See Section 4.2 for more details. Whenever a professor asks you questions not in this document, add this question to it.

### 4. Interviewing #

• Pay attention to and be aware of why they brought you there in the first place. Is it because your research fits into the department’s future agenda? Play up these parts in your talk and individual meetings.

• Try to engage, not to impress. Professors interviewing you have much more research experience and many more publications than you do. There is no need to impress others.

#### 4.1 Job talk #

• If you cannot hook the audience in the first 10 minutes, chances are your talk will just be mediocre.

• NEVER say anything impromptu, i.e., only say what you have practiced! And NEVER exceed the time limit. Professors are busy and no one wants to spend more time than necessary on your talk. You should be mindful of the time limit of your talk even when you are answering questions from the audience.

• What your job talk is supposed to show:

1. “Solid past research”. Start with the overarching motivation before diving into two specific projects you are most proud of.
2. “Compelling future research”. Dedicate the final 10 minutes to this part. Be bold.
3. Capability + Personality. Through your talk, the audience should know that you are the person who will be successful academically and you will be a chill colleague.

• You don’t need to constantly tweak your talk.

#### 4.2 One-on-one meetings with faculty #

• You are supposed to chat casually with 10-20 faculty members in the department. Each chat is around 30 minutes.

• Be a good listener and let the other person lead. Try to listen to and talk about their research, not yours.

#### 4.3 Department chair (or Department head) meeting #

This is the most important meeting you’ll have. Again, be a good listener and let the other person lead the meeting. Try to listen to and talk about the department’s future vision. You can mention how that vision connects to your future research agenda.

#### 4.4 Dean meeting #

The dean is probably not involved in the hiring decisions. So just be chill.

#### 4.5 Graduate students meeting #

Be nice to students. Engage, inspire, and help them!

#### 4.6 Mealtime conversations #

• It’s still part of the interview!

• Ask some questions about living in this area, like the housing price. This shows your genuine interest in working there.

### 5. Job talk revisited #

• Again, never say anything impromptu.

• Don’t linger on slides. Just click “Next”!

• When switching to the next slide, don’t say anything about it before you advance to it.

• Minimize the use of a laser pointer.

### 6. One-on-one meeting with faculty revisited #

• You can take notes while talking, but only write down key words that help you write follow-up emails later.

• Remember to let the other person lead the meeting. Be comfortable with short silences. It’s better to let them control the meeting. Be a good listener.

• During conversations with faculty, try to make connections with them. When they lead the talk and talk about their research, you can mention how their studies relate to yours.

• Why are CONNECTIONS super important? Because if there are no connections at all, they won’t even remember you and won’t have anything to say about you after the meeting.

• Try HARD to appeal to your “talk mates”, even if they are in totally different subfields than yours. Everyone’s vote counts in hiring decisions. You cannot afford to lose anyone’s vote.

• Be open, sincere, and honest. If they ask you where else you are applying besides their institution, let them know openly.

• There is no need to spend TOO MUCH time getting to know every one of your “talk mates” beforehand. They know you are busy and don’t expect that you know about their research in detail.

• People sign up for meetings with you for a reason. For example, they may want to collaborate with you, they may want to know whether your research strengthens the department’s research portfolio, or they may be on the faculty search committee. Only when you let the other party lead the meeting and listen well can you learn what they want from you and adapt accordingly.

• Follow-up emails to people you talk to are VERY important.

### 7. Other suggestions #

• Be nice to the department admin. They’ll let everybody know if they really hate you.

• Try to focus on your job search. It’s very important. It’s OK not to have time to do research when you are doing tons of interviews.

• When you get an offer from the school, don’t sound demanding when you negotiate. Don’t be arrogant.

1. Guo, P. (2021). A mid-timer’s thoughts on publishing academic papers .

PP. 1-2

## 2021-03-29 #

Reviewed Guo (2013)

## 2021-03-28 #

Finished Guo (2013)

## 2021-03-27 #

Continued with Guo (2013).

PP. 8-18

## 2021-03-26 #

Guo, P. (2013). Philip’s notes on the tenure-track assistant professor job search.

PP. 1-8

## 2021-03-25 #

• When you become a first-year assistant professor, you’ll have a lot of free time. You only need to dedicate roughly six hours per week (3 hours to teaching, and 3 hours to department and committee meetings). You are free to do whatever you want for the rest of your time each week. This freedom is the challenge: you have to learn how to manage this free time.

• Tenure. From the first day of your job, you have about five years to build up your publication list, submit your material in the sixth year, and hopefully get tenure. To get tenure, you need (1) to get recommendation letters from full professors detailing why you are the best in the field you are in; and (2) to have good relationships with tenured colleagues in your department so that they will interpret the recommendation letters in the most favorable light. For the first requirement, if your research is super innovative, then you risk not being able to find enough recommenders in your “field”.

• Even though tenure is a big thing, you shouldn’t wishfully think that things will be better after you get tenure. Also, don’t do something you don’t enjoy simply to get tenure with the hope that you can do what you really love after tenure. Rather, be true to yourself, do what you love right now, and have a lifestyle you desire right now.

• Back to time management, the key is to prioritize. Even though your life as a professor seems free, you have tons of things to do. Your time and energy are limited, so you have to prioritize the most important things and give up other not-so-important things. Otherwise, you won’t have focused attention and energy to do your job well.

• No need to be stressed out by tenure. It’s easier to get tenure than to get fired as faculty.

## 2021-03-23 #

Martinez, W. (2018). How science and technology developments impact employment and education . PNAS, 115(50), 12624-12629.

This paper is very different from many papers I read, possibly because the author is an employee of the government. It aims to provide information for researchers interested in one topic, rather than to show a scientific result obtained through the author’s research. It’s an eye-opener for me, since it lets me know a different way of writing a paper. But I guess this format is very rare in the scientific community. It’s more like a lab report.

## 2021-03-22 #

Finished Pennycook, G., Epstein, Z., Mosleh, M., Arechar, A. A., Eckles, D., & Rand, D. G. (2021). Shifting attention to accuracy can reduce misinformation online . Nature. https://doi.org/10.1038/s41586-021-03344-2

### Study 1 #

• Purpose: To see whether mistaken belief is sufficient to explain the sharing of misinformation.

• Procedure: 1,015 participants were recruited through Amazon Mechanical Turk. Participants were presented with headlines, lead sentences, and images for 36 news stories obtained from social media. Half of these stories were entirely false and the other half were true. Half of the headlines were favorable to Democrats and the other half favorable to Republicans. Participants were randomly assigned to judge either the veracity of each headline they saw (accuracy condition) or their intention to share the story (sharing condition).

• Results:

• In the accuracy condition, true headlines are rated as accurate significantly more often than false headlines. Whereas politically concordant headlines are rated as accurate more than politically discordant headlines, this difference stemming from political alignment is significantly smaller than the veracity-driven difference.

• In the sharing condition, the difference in sharing intention based on political alignment is significantly larger than the veracity-driven difference.

• Across conditions: the effect of headline veracity is significantly larger in the accuracy condition than in the sharing condition; the effect of political alignment is significantly larger in the sharing condition than in the accuracy condition.

There is a disconnect between accuracy rating and sharing intention: even though very few Republicans rated a headline as accurate in the accuracy condition, many Republicans in the sharing condition intend to share it. This disconnection disputes the confusion theory (which states that people share misinformation because they do not know the information is false): People can detect the falsehood, but decide to share it nonetheless.

People do care about veracity when deciding whether to share or not, according to a study (Coppock & McClellan, 2019). Then, why do people knowingly share misinformation? The authors of the current study formulate the idea that the context of social media primed people to please followers or signal their membership.

### Study 3 & Study 4 #

Control condition: participants are shown 24 news headlines (half true & half false; half favorable to Democrats & half favorable to Republicans) and asked how likely they are to share each headline on Facebook.

Treatment condition: participants are asked to rate the accuracy of one single non-partisan news headline at the start of the experiment and then go through the same procedure as in the control condition.

As shown in Figure 2, participants in the treatment condition are significantly less likely to consider sharing false news headlines than those in the control condition. This treatment effect is significantly larger for politically concordant headlines than for politically discordant news. However, participants in the two conditions are equally likely to consider sharing true headlines.

In Study 3, the difference in sharing intention for true versus false headlines is 2.0 times bigger in the treatment condition relative to the control condition. This figure is 2.4 in Study 4.

In both studies, when asked after the experiment, participants in the two conditions do not differ significantly in how important they consider it to share only accurate content.

### Study 5 #

A more representative group of participants is recruited through Lucid.

Besides the two conditions in Studies 3 and 4, there are also an active control condition (in which participants are asked to rate the humorousness rather than the veracity of a single news headline before the experiment) and an importance treatment condition (in which participants are asked before the experiment how much importance they attach to sharing only true content). The results replicated those in Studies 3 and 4. See Fig. 2-c.

As Fig. 3 a-c shows, there exists a positive correlation between the perceived accuracy (rated categorically as 1-7, with 7 indicating the most accurate) and the treatment effect in Studies 3, 4, and 5. For each headline, the treatment effect is measured as $\text{Sharing intention}_{\text{treatment}} - \text{Sharing intention}_{\text{control}}$.

This positive correlation means that the least accurate news headlines are the ones that the “accuracy salience treatment” discouraged people from sharing the most.
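To make the per-headline calculation concrete, here is a minimal sketch. The headline numbers are invented for illustration (they are not data from the paper), and `pearson` is a small helper written just for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical headlines:
# (perceived accuracy on the 1-7 scale,
#  mean sharing intention in control, mean sharing intention in treatment)
headlines = [
    (1.5, 0.30, 0.18),
    (2.5, 0.32, 0.24),
    (4.0, 0.35, 0.31),
    (5.5, 0.40, 0.39),
    (6.5, 0.45, 0.45),
]

accuracy = [h[0] for h in headlines]
# Treatment effect per headline = treatment minus control.
effects = [h[2] - h[1] for h in headlines]

r = pearson(accuracy, effects)
print("treatment effects:", [round(e, 2) for e in effects])
print("correlation with perceived accuracy:", round(r, 2))
```

In this toy data the least accurate headline has the most negative treatment effect, so the correlation comes out positive, which is the pattern described above.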

Study 7: I do not understand it. It seems that this part is written by a different person. But I know the conclusion of Study 7: the accuracy message led Twitter users who frequently share misinformation to increase the quality of the content they share.

Overall, this article supports the idea that shifting people’s attention to accuracy discourages them from sharing misinformation.

## 2021-03-21 #

Continued with Pennycook et al. (2021)

PP. 3

## 2021-03-20 #

Continued with Pennycook et al. (2021)

PP. 2-3

## 2021-03-19 #

Pennycook, G., Epstein, Z., Mosleh, M., Arechar, A. A., Eckles, D., & Rand, D. G. (2021). Shifting attention to accuracy can reduce misinformation online . Nature. https://doi.org/10.1038/s41586-021-03344-2

PP. 1-2

## 2021-03-18 #

1. Lamott, A. (1994). Shitty first drafts . Writing about writing: A college reader, 527-531.

I agree. The most important thing about writing is getting the first draft down, even with tears on your face. Don’t procrastinate. Don’t expect the first draft to be elegant. You just need to get enough words, which will make you less anxious (as you now have a draft, as shitty as it is). Then you can spend more time later editing it.

1. Caçola, P. (2013). Patricia Goodson: Becoming an academic writer: 50 exercises for paced, productive, and powerful writing. Higher Education, 65(6), 785.

The trick here is that you need to separate generating text from editing. These are two different activities.

When generating texts, don’t look at the words you’ve produced, and don’t think about the quality. Just write down what’s in your mind. Editing is another job you’ll do later, not now.

After generating enough texts, then you can edit them.

## 2021-03-17 #

Finished Türkay et al. (2020)

## 2021-03-16 #

Continued with Türkay et al. (2020)

PP. 1-5

## 2021-03-15 #

1. Finished Taber & Whittaker (2020)
• People tend to post positive things in their Rinsta (i.e., real Instagram accounts) and negative (but authentic) things in their Finsta (i.e., fake Instagram accounts). They want to present a polished and emotion-free self on their Rinsta. Self-presentation can be coarse and emotional on Finsta. People might also talk about taboo activities on their Finsta.

• People treat Rinsta and Finsta differently because they are targeting different audiences.

• Since people might have multiple accounts on the same social media platform and use these accounts differently, surveying people about their experiences of using this medium might lead to wrong conclusions.

1. Türkay, S., Formosa, J., Adinolf, S., Cuthbert, R., & Altizer, R. (2020, April). See no evil, hear no evil, speak no evil: How collegiate players define, experience and cope with toxicity. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-13).

PP. 1

## 2021-03-14 #

Continued with Taber & Whittaker (2020)

PP. 4-8

## 2021-03-13 #

1. I skimmed through Moran et al. (2021) and found it was too difficult for me to understand. I gave it up.

2. Taber, L., & Whittaker, S. (2020, April). “On Finsta, I can say ‘Hail Satan’”: Being Authentic but Disagreeable on Instagram. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-14).

PP. 1-4.

## 2021-03-12 #

1. Finished McDonald et al. (2021)

The authors trained their machine learning model with the behavior and characteristics of known vessels associated with forced labor. They then used this model to predict high-risk vessels among 16,000 industrial vessels, utilizing these vessels' satellite monitoring data. The results showed that 14% - 26% of these vessels are of high risk, and 57,000 - 100,000 people worked on these vessels.

This study is very interesting. It shows how machine learning tools can be used to improve the life of people.

1. Moran, R., Dayan, P., & Dolan, R. J. (2021). Human subjects exploit a cognitive map for credit assignment . PNAS, 118(4).

PP. 1

## 2021-03-11 #

1. Finished Brashier et al. (2021)

It’s more effective to put corrections after the headlines of fake news, compared to putting them before or during the headlines.

1. McDonald, G. G., Costello, C., Bone, J., Cabral, R. B., Farabee, V., Hochberg, T., … & Zahn, O. (2021). Satellites can reveal global extent of forced labor in the world’s fishing fleet . PNAS, 118(3).

PP. 1-4

## 2021-03-10 (Completed on 2021-03-11) #

1. Finished Zhu et al. (2013)

I skimmed through the rest of the article. I could not fully understand it. My background in network science is still very weak.

1. Brashier, N. M., Pennycook, G., Berinsky, A. J., & Rand, D. G. (2021). Timing matters when correcting fake news . PNAS, 118(5).

PP. 1

## 2021-03-09 (Completed on 2021-03-10) #

1. Simoneschi, D. (2021). Opinion: We need to improve the welfare of life science trainees . PNAS, 118(1).

2. Zhu, Y. X., Huang, J., Zhang, Z. K., Zhang, Q. M., Zhou, T., & Ahn, Y. Y. (2013). Geography and similarity of regional cuisines in China . PloS one, 8(11), e79161.

PP. 1

## 2021-03-08 (Completed on 2021-03-10) #

1. Finished Yu et al. (draft)

I skimmed through the rest.

Conclusions: (1) Feeling happy did not decrease people’s affective polarization; (2) Feeling happy did not make people more likely to believe in conspiracy theories or less likely to consider deep fake as fake.

1. Casas, A., Menchen-Trevino, E., & Wojcieszak, M. (draft). Exposure to extremely partisan news from the other political side shows scarce boomerang effects.

The study found that exposure to extreme news of the opposing ideology did not make people more polarized.

## 2021-03-07 #

1. Finished Richmond-Rakerd et al. (2021)

2. Yu, X., Wojcieszak, M., Lee, S., Casas, A., Azrout, R., & Gackowski, T. (draft) The (null) effects of happiness on affective polarization, conspiracy endorsement, and deep fake recognition: Evidence from five survey experiments in three countries.

PP. 1-9

## 2021-03-06 #

1. Finished Mastroianni et al. (2021)

2. Richmond-Rakerd, L. S., Caspi, A., Ambler, A., d’Arbeloff, T., de Bruine, M., Elliott, M., … & Moffitt, T. E. (2021). Childhood self-control forecasts the pace of midlife aging and preparedness for old age . PNAS, 118(3).

• Self-control in childhood is related to slower aging of the body, lower brain-age scores, appearing younger in photos, and being better prepared for later life, both financially and in terms of social/emotional support. These associations are independent of socioeconomic origin and IQ.

• People’s self-control is subject to change. Self-control in adulthood was associated with this slower pace of aging, and better health, financial, and social preparedness, even controlling for self-control in childhood.

PP. 1-7

## 2021-03-05 (Completed on 2021-03-06) #

1. Finished Buyalskaya et al. (2021)

2. Mastroianni, A. M., Gilbert, D. T., Cooney, G., & Wilson, T. D. (2021). Do conversations end when people want them to?. PNAS, 118(10).

PP. 1-7

## 2021-03-04 #

Continued with Buyalskaya et al. (2021)

PP. 3-9

## 2021-03-03 #

1. Finished Rakita et al. (2019)

I skimmed through it.

1. Buyalskaya, A., Gallo, M., & Camerer, C. F. (2021). The golden age of social science . PNAS, 118(5).

PP. 1-3

## 2021-03-02 #

Rakita, D., Mutlu, B., Gleicher, M., & Hiatt, L. M. (2019). Shared control–based bimanual robot manipulation . Science Robotics, 4(30).

PP. 1-6

## 2021-03-01 #

1. Finished Kim et al. (2020)

2. Hutchinson, S. (2010). Surviving the Review Process [Editor’s Corner] . IEEE Robotics & Automation Magazine, 17(4), 101-104.

• As an author, you should shoulder the responsibility of convincing the reader of the contribution of your work.

• Some reviewers may conclude that your paper should be rejected before they read all of your content. Therefore, their criticism will only be for the first few pages. It is wrong to conclude that they are satisfied with the remaining part of your paper.

• If reviewers comment that your work lacks a thorough literature review, you should not assume that adding some references will solve the problem. This kind of criticism implies that they don’t think your work is important in the context of extant research.

• Reviewers might send confidential comments to editors and associate editors, which are invisible to you.

• If your paper is rejected by a journal and you decide to submit it to another journal after revision, you should send a note to the editor explaining the history of your work. Do not simply do “journal shopping”.

• Be objective, emotionally detached, and friendly in your response to reviewers' comments. No matter how offended you feel, do not insult reviewers' competence or their motives. Be “gracious, humble, and constructive”.

• If reviewers do not appear to understand your paper, take the responsibility, explain, clarify, and educate, rather than attacking reviewers' competence.

• Even if you feel a criticism is unjustified, and no revision for it is necessary, it’s better to make relevant changes according to the criticism and describe your improvements. You can make your points clearer if you feel they are correct, rather than arguing no revision is needed, which sounds arrogant to the reviewers.

# 2021-02 #

## 2021-02-28 #

Continued with Kim et al. (2020)

• Experiment 2: Showing people information about variability in individual outcomes alleviates people’s overestimation of treatment effectiveness when presented only with mean differences (which was conveyed through 95% CI with visualizations, the most misleading format in Experiment 1). Among all methods to convey outcome uncertainty, telling people directly the probability of superiority is the most effective (at alleviating overestimation).

• Experiment 3: Combining the method of “telling people directly probability of superiority” with any of the other four methods (category, variance, height analogy, and weather analogy) did not statistically significantly reduce people’s willingness to pay. Adding additional information did not have negative impacts either (except when weather analogy was applied).

• Experiment 4: Even if mean differences are conveyed through four other methods (in Experiment 1) rather than 95% CI with visualizations, telling people afterwards directly probability of superiority reduces their willingness to pay, compared to when showing people only inferential uncertainty through 95% CI with visualizations.

PP. 8-10

## 2021-02-27 (completed on 2021-02-28) #

Continued with Kim et al. (2020)

• Aim of this study: to find out how to best communicate effect size to lay people.

• Experiment 1: Showing mean differences between two groups through 95% confidence intervals (which represent inferential uncertainty) with visualizations led people to overestimate treatment effectiveness the most, compared to other methods.

PP. 3-7

## 2021-02-26 #

Kim, Y. S., Hofman, J. M., & Goldstein, D. G. (2020). Effectively Communicating Effect Sizes. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI’20). Association for Computing Machinery, New York, NY, USA.

• Outcome uncertainty is more useful for individuals than inferential uncertainty. The former is predicting what an individual outcome is likely to be, whereas the latter one shows the group average. It’s possible that the group with a larger mean has more variability, whereas the other group (with a smaller mean) is more reliable. Wise people will choose the second one rather than the first one.

• Cohen’s d, expressed as $d = \frac{\mu_1 - \mu_2}{\sigma}$, conveys information about both the treatment effect and the variability in individual outcomes.
• Probability of superiority: how often a randomly chosen individual in one group scores higher (or lower) than a randomly chosen individual in another group.
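The two measures above are closely linked. As a minimal sketch, assuming both groups are normally distributed with the same standard deviation (an assumption of this example, not necessarily of the paper), the probability of superiority equals $\Phi(d / \sqrt{2})$. The group means and standard deviation below are made-up numbers.

```python
from math import sqrt
from statistics import NormalDist

def cohens_d(mu1, mu2, sigma):
    """Standardized mean difference between two groups."""
    return (mu1 - mu2) / sigma

def prob_superiority(d):
    """P(random member of group 1 > random member of group 2),
    assuming normal outcomes with equal variance in both groups."""
    return NormalDist().cdf(d / sqrt(2))

# Hypothetical example: group means 105 and 100, common SD 10.
d = cohens_d(105, 100, 10)
ps = prob_superiority(d)
print(f"d = {d:.2f}, probability of superiority = {ps:.3f}")
```

A "medium" effect of d = 0.5 corresponds to a probability of superiority of only about 64%, which shows how much individual outcomes overlap even when the means clearly differ.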

PP. 1-3

## 2021-02-25 #

Finished Hofman et al. (2020)

• People tend to overestimate the effect size and underestimate the variability in outcomes when presented with 95% confidence intervals that show inferential uncertainty, compared to when presented with 95% prediction intervals that show outcome uncertainty.

For example, suppose we are comparing heights of men with those of women:

• Inferential uncertainty is how confident we are about our estimation of each group’s average height, based on our measurement. So error bars will extend 1.96 (sometimes 1.0) standard error above and below the mean.

• Outcome uncertainty is the variability of each individual’s height around their group’s mean. Therefore, error bars will extend 1.96 (sometimes 1.0) standard deviation above and below the mean.

See Fig. 1 for illustration.
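The difference between the two kinds of error bars can be sketched in a few lines. The heights below are simulated with made-up parameters (mean 175 cm, SD 7 cm), and the "mean ± 1.96 SD" range is the simple approximation used in these notes rather than an exact prediction interval.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)
# Simulated heights in cm; 175 and 7 are invented values for illustration.
heights = [rng.gauss(175, 7) for _ in range(100)]

m = mean(heights)
sd = stdev(heights)              # spread of individual heights
se = sd / sqrt(len(heights))     # uncertainty of the estimated mean

ci = (m - 1.96 * se, m + 1.96 * se)   # inferential uncertainty
pi = (m - 1.96 * sd, m + 1.96 * sd)   # outcome uncertainty (approximate)

print(f"95% CI for the mean:         {ci[0]:.1f} to {ci[1]:.1f}")
print(f"95% range for an individual: {pi[0]:.1f} to {pi[1]:.1f}")
```

With 100 people, the outcome-uncertainty bars are ten times wider than the confidence interval (the ratio is the square root of the sample size), which is why showing only the confidence interval can make a group difference look more decisive than it is for any one person.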

• People have the largest misperceptions when the effect size is small. Unfortunately, small effect sizes are the norm in scientific studies, which use 95% CI more often than 95% PI. This indicates that readers are likely to exaggerate the scientific results they encounter.

I like this study very much. It’s a little long but worth reading.

## 2021-02-24 #

Continued with Hofman et al. (2020)

PP. 4-7

## 2021-02-23 #

Hofman, J. M., Goldstein, D. G., & Hullman, J. (2020, April). How visualizing inferential uncertainty can mislead readers about treatment effects in scientific results . In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

PP. 1-4

## 2021-02-22 #

Hao, K. (2019). We analyzed 16,625 papers to figure out where AI is headed next . MIT Technology Review.

Artificial intelligence is powered by deep learning, a family of algorithms that use statistics to identify patterns in data. This capability enables computers to mimic human skills, and to provide recommendations to users (for example, in Google, Facebook, and Netflix).

Papers in arXiv’s artificial intelligence section focused on machine learning in around 1995-2005, on neural networks in around 2010-2015, and on reinforcement learning in around 2015-2019.

Before around 2005, papers in arXiv’s AI section focused on “knowledge-based reasoning”, trying to recreate human reasoning using man-made rules. After around 2005, the attention shifted to machine learning, which is a parent term that includes deep learning. Rather than writing rules, machine learning trains computer programs to extract rules from a sea of data.

Besides neural networks, there are other machine learning methods, for example, “Bayesian networks”, “Markov methods”, and “support vector machines”, but neural networks have dominated the playground since around 2016.

Three types of machine learning: supervised, unsupervised, and reinforcement learning. Reinforcement learning wasn’t a new idea, but it didn’t gain much momentum until AlphaGo’s groundbreaking success.

## 2021-02-21 #

Continued with Kim et al. (2020)

## 2021-02-20 #

Continued with Kim et al. (2020)

People overestimated both error and variance of effect size when only told the mean differences. 95% confidence intervals with visualizations led to the largest overestimation, as shown in Figure 2. Presenting the mean differences in absolute or percentage terms, or simply telling people that a difference exists, made people overestimate the effect size as well, but to a lesser degree.

Informing people of the variability helped alleviate this issue, with directly telling people the probability of superiority performing the best (although the difference did not reach statistical significance).

I do not really understand this paper. Will come back later.

## 2021-02-19 (completed on 2021-02-20) #

Kim, Y. S., Hofman, J. M., & Goldstein, D. G. (2020). Effectively Communicating Effect Sizes . In Computation + Journalism Symposium 2021 .

PP. 1-4

## 2021-02-18 #

Completed Sekara et al. (2018)

• As Fig. 1(B) shows, the percentage of publications in both Nature and Physical Review D by new PIs has been decreasing over the past two decades. The figure also shows that different types of journals have different patterns regarding the share of papers by different categories of PIs.

• Chaperone effect $C$ is measured as

$$C = \frac{c}{c_{random}}$$

where $c$ is the ratio of the number of authors who became last authors from non-last authors to the number of those who never made this transition. $c_{random}$ is the same ratio but in a system where the order of authors is randomly permuted. This randomization is necessary if we want to compare the chaperone effect of different scientific fields. If we measure the chaperone effect by $c$ rather than $C$, we fail to consider the fact that $c$ might be influenced by individual productivity, team size, or simply random factors.

If $C$ is larger than 1, then the chaperone effect exists.

• Fig. 2 shows that the chaperone effect is strongest in interdisciplinary fields, biology, and medicine, and weakest in Mathematics.

• An interesting question to ask is whose papers have more impact, those by new PIs, or those by chaperoned and established PIs. The authors assumed that those by new PIs have more impact because it is difficult to publish in a journal, say, in Nature, if you didn’t have a publication record there. However, results show that papers by established and chaperoned PIs receive more citations.

• The chaperone effect is stronger in prestigious, interdisciplinary journals with a general audience than in field-specific journals.

## 2021-02-17 (completed on 2021-02-18) #

Sekara, V., Deville, P., Ahnert, S. E., Barabási, A. L., Sinatra, R., & Lehmann, S. (2018). The chaperone effect in scientific publishing. PNAS, 115(50), 12603-12607.

• Three kinds of principal investigators (PIs, last author in a paper):

1. New PIs who have never published in a specific journal.
2. Chaperoned PIs who have published in a journal previously as nonlast authors.
3. Established PIs who have published in a journal before as last authors.

## 2021-02-16 (completed on 2021-02-17) #

Continued with Strother et al. (2021)

Understanding the methodology:

• Whether students' political ideology changed significantly: compare the ideology score in Table 1 using a two-tailed t test.

• Whether roommates' ideology predicted students' ideology changes: using OLS regressions with roommates' ideology at wave 1 as the independent variable, students' ideology at wave 2 as the dependent variable, and students' ideology at wave 1 as the control variable. The authors also included other control variables in different models, as can be seen in Table 3. The range of p values was (0.012, 0.069). The association is significant, but its effect size is small.
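To check my understanding of the regression setup, here is a minimal sketch on simulated data. The coefficients 0.8 and 0.05 and the noise level are invented for the simulation and are not the paper's estimates. `ols` is a small helper that solves the normal equations, written here only to keep the example free of external libraries.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

rng = random.Random(0)
n = 2000
own_w1 = [rng.gauss(0, 1) for _ in range(n)]       # student's ideology, wave 1
roommate_w1 = [rng.gauss(0, 1) for _ in range(n)]  # randomly assigned roommate
# Simulated wave-2 ideology: mostly persistence plus a small roommate effect.
own_w2 = [0.8 * o + 0.05 * r + rng.gauss(0, 0.5)
          for o, r in zip(own_w1, roommate_w1)]

X = [[1.0, r, o] for r, o in zip(roommate_w1, own_w1)]
beta = ols(X, own_w2)
print("intercept, roommate effect, own wave-1 effect:",
      [round(b, 3) for b in beta])
```

Controlling for the student's own wave-1 ideology is what lets the roommate coefficient be read as a change over the year rather than as a baseline similarity.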

I did not understand the method which focuses on “students assigned to roommates who had different baseline political views” (P. 4)

Will come back to review later.

## 2021-02-15 (completed on 2021-02-16) #

Strother, L., Piston, S., Golberstein, E., Gollust, S. E., & Eisenberg, D. (2021). College roommates have a modest but significant influence on each other’s political ideology. PNAS, 118(2).

• Table 1 shows that first-year college students tend to be liberal (compared to conservative), which is consistent with common understanding. However, contrary to popular claims, after the first year, these students became a little bit more conservative, rather than liberal.

• Living with a roommate of a different ideology had a statistically significant influence on students' ideology change (they moved closer to their roommates).

PP. 1-4

## 2021-02-14 #

Finished Kubin et al. (2021)

• Perceptions of rationality mediated the relationship between stance based on personal experiences and increased respect.

• When someone disagrees with you on topics related to morality or politics, he or she has more doubts about your argument if it is based on facts (compared to personal experiences).

## 2021-02-13 #

Kubin, E., Puryear, C., Schein, C., & Gray, K. (2021). Personal experiences bridge moral and political divides better than facts . PNAS, 118(6).

• People who base their viewpoints on personal experiences (rather than on facts) are considered more rational and are respected more by their opponents. Opponents are also more willing to interact with them.

• To increase perceived rationality, personal experiences should be relevant and harm-based.

• To foster respect, personal experiences should be “personal”.

PP. 1-4

## 2021-02-12 #

Gates, A., Gysi, D., Kellis, M., & Barabási, A. L. (2021). A wealth of discovery built on the Human Genome Project — by the numbers . Nature.

• 22% of publications on genes referenced only 1% of all genes. This might be due to preferential attachment (“rich-gets-richer”). Risk-averse researchers and funders might have been afraid of exploring uncharted territories.

• Complexity lies in the interactions of individual components. Understanding components is necessary, but not sufficient, to know a system.

## 2021-02-11 #

Schulz, L., Rollwage, M., Dolan, R. J., & Fleming, S. M. (2020). Dogmatism manifests in lowered information search under uncertainty . PNAS, 117(49), 31527-31534.

I did not really understand the details of this paper. That said, the conclusion is clear:

Dogmatic individuals are less likely to seek out additional information, especially when their initial decisions are uncertain. This is worrisome because after dogmatic people encounter fake news, they are less likely to seek out correcting pieces.

## 2021-02-10 #

Choi, S. H., Rao, V. D., Gernat, T., Hamilton, A. R., Robinson, G. E., & Goldenfeld, N. (2020). Individual variations lead to universal and cross-species patterns of social behavior . PNAS, 117(50), 31754-31759.

I could not understand the details in this paper.

## 2021-02-09 #

Finished Killingsworth (2021)

• As can be seen in Fig. 1 , both current happiness (experienced well-being) and remembered happiness (evaluative well-being) grow linearly with log(income), without a plateau.

• Below an annual household income of $80,000, higher income was more strongly associated with fewer negative feelings than it was above that threshold. Positive feelings grow evenly across the income range.

• Sense of control of one’s life explained 74% of the relationship between income and experienced well-being.

• People having smaller household income were happier if they attributed less importance to money; Those who earned a lot were happier if they attributed more importance to money.

• Across the income range, the more people equated money and success, the less happy they felt.

• This study differs from earlier ones in two main ways: 1) respondents answered in real time when they saw the prompts from the app; 2) happiness was measured on a continuous rather than dichotomous scale.
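A worked note on the log-linear relationship in the first bullet above (the symbols $W$, $a$, and $b$ are my own notation, not the paper's). If well-being at income $x$ is roughly $W(x) = a + b \ln x$, then doubling income adds the same amount of well-being at every income level:

$$W(2x) - W(x) = b \ln(2x) - b \ln x = b \ln 2$$

So, under this relationship, moving from 20,000 to 40,000 dollars would be associated with the same gain as moving from 60,000 to 120,000 dollars. Equal proportional raises matter equally, while equal absolute raises matter less at higher incomes.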

I like this study.

See this criticism.

## 2021-02-08 #

1. Finished Stockard et al. (2021)
• Non-URM (underrepresented minorities) men reported the greatest professional support from peers and postdocs, followed by non-URM women.

• URM students were twice as likely to report that they did not receive enough financial support for living.

• Men were more likely than women to express greater commitment to completing the PhD and continuing research in the chemistry field. Amazingly, within each gender group, URM students were more likely to do so.

• When we do not consider other factors, students having a supportive advisor were more likely to finish the PhD, find a postdoc, and have an academic career at a research institute. However, this positive effect on women is not found in bigger and more renowned chemistry departments.

1. Killingsworth, M. A. (2021). Experienced well-being rises with income, even above $75,000 per year. PNAS, 118(4). PP. 1-2 ## 2021-02-07 # 1. Finished Chen at el. (2020) I skimmed through the rest of this paper. 1. Stockard, J., Rohlfing, C. M., & Richmond, G. L. (2021). Equity for women and underrepresented minorities in STEM: Graduate experiences and career plans in chemistry. PNAS, 118(4). • Women who are underrepresented minorities (URM) were the least satisfied with their advisor-student relationship. Other women were the next least satisfied. URM men were the most satisfied with the relationship. PP. 1-3 ## 2021-02-06 # Chen, Y., Jiang, M., & Kesten, O. (2020). An empirical evaluation of Chinese college admissions reforms through a natural experiment . PNAS, 117(50), 31696-31705. PP. 1-5 ## 2021-02-05 # Finished Sterling et al. (2020) ### Results # • Prior to graduation, the level of self-efficacy of women is lower than that of men. • Without considering any other factors, self-efficacy is positively related to salary in initial jobs. • For engineering and CS undergraduates, in their first jobs after they graduate, women’s salaries are lower than men’s. • Gender does not predict how important salary is to a person. • Women do emphasize workplace culture more, but this emphasis is associated with higher rather than lower compensation. • Self-efficacy is a significant mediator for the relationship between being female and the salary. (See Figure S5 ) • Self-efficacy influences whether a person intends to enter jobs related to engineering and computer science. ## 2021-02-04 # Sterling, A. D., Thompson, M. E., Wang, S., Kusimo, A., Gilmartin, S., & Sheppard, S. (2020). The confidence gap predicts the gender pay gap among STEM graduates. PNAS, 117(48), 30303-30308. This paper investigates why there is a gender pay gap among STEM graduates. They hypothesized that engineering self-efficacy (ESE) is the reason. 
To test this hypothesis, they surveyed 559 college students majoring in engineering and computer science in 2015 (when the students were still enrolled in the program), in 2016, and again in 2017, when the students had just graduated and entered the workforce.

### Theories #

Two possible reasons why women earn less: 1) women are socialized to believe money is not as important as family; 2) women prefer inclusive environments to competitive ones, the latter being more favorable to higher salaries.

## 2021-02-03 #

Andrasfay, T., & Goldman, N. (2021). Reductions in 2020 US life expectancy due to COVID-19 and the disproportionate impact on the Black and Latino populations. PNAS, 118(5).

• The COVID-19 pandemic is projected to reduce 2020 US life expectancy by 1.13 y. This decline is larger than that in other developed countries, which already had higher life expectancy than the US before the pandemic.

• The reduction in life expectancy in the US is not the same for each racial group. The reductions for the Black (2.10 y) and Latino (3.05 y) populations are much larger than that for Whites (0.68 y).

• For the Black and Latino populations, younger people face a higher burden of COVID-19-related mortality. This might be because their jobs are less compatible with remote working, so they have to expose themselves to the virus to earn money during the pandemic.

• These racial differences in the reduction in life expectancy will result in a 39% increase in the life expectancy gap between the Black population and Whites (from 3.6 y in 2017 to 5.06 y in 2020). The life expectancy advantage that the Latino population has relative to Whites will decrease from 3.3 y in 2017 to 0.93 y in 2020, a 70% plunge.

## 2021-02-02 #

Finished Luhrmann et al. (2021)

This study tries to answer why some people are more likely than others to experience the presence of gods and spirits.
In four studies, the authors, possibly among others, interviewed and surveyed local residents having faith (in charismatic evangelical Christianity or the local religion) and urban undergraduate students in four places: the US, Ghana, Thailand, and China. The study found that porosity and absorption played distinct roles in people’s spiritual experiences across different cultures and religions. Porosity is a cultural attribution that defines the boundary between “mind” and the “world”. People living in cultures with different levels of porosity might have different viewpoints on whether their mental experiences come from and have influences on the outside world. Absorption is a person’s tendency to be immersed in his or her own mind.

## 2021-02-01 #

Luhrmann, T. M., Weisman, K., Aulino, F., Brahinsky, J. D., Dulin, J. C., Dzokoto, V. A., … & Smith, R. E. (2021). Sensing the presence of gods and spirits across cultures and faiths. PNAS, 118(5).

PP. 1-4

# 2021-01 #

## 2021-01-31 #

McDermott, A. (2021). Science and Culture: At the nexus of music and medicine, some see treatments for disease. PNAS, 118(4).

Music seems to have medical benefits. Musical treatment might alleviate pain and help delirium patients and those suffering from Parkinson’s. Early studies of music therapy were not necessarily “scientific” in the sense that many of them were not properly blinded trials. Right now, there are some ongoing studies funded by NIH.

## 2021-01-30 #

1. Skimmed through the rest of Fienberg (2006). Too difficult for me to grasp now.
2. Should You Go To Grad School? by Duncan Watts. 1/4 on 2021-01-30 (completed on 2021-01-31, but not during the APAD time period)

## 2021-01-29 #

1. Finished West et al. (2013)

Representation of women increased in general: in published academic papers from 1665-1989, 15.1% of the authors were women. This number increased to 27.2% for papers from 1990-2012.
Figure 3 shows that the percentage of women as first authors increased, but they are much less likely than men to be the last authors.

1. Fienberg, S. E. (2006). When did Bayesian inference become “Bayesian”? Bayesian analysis, 1(1), 1-40.

PP. 1-2

## 2021-01-28 #

1. Zhao, Z. D., Yang, Z., Zhang, Z., Zhou, T., Huang, Z. G., & Lai, Y. C. (2013). Emergence of scaling in human-interest dynamics. Scientific reports, 3(1), 1-7.

I skimmed through it.

1. West, J. D., Jacquet, J., King, M. M., Correll, S. J., & Bergstrom, C. T. (2013). The role of gender in scholarly authorship. PloS one, 8(7), e66212.

PP. 1-3

## 2021-01-27 #

Dodds, P. S., Muhamad, R., & Watts, D. J. (2003). An experimental study of search in global social networks. Science, 301(5634), 827-829.

• 98,847 people registered to participate in a “global search” task. They were to reach the person assigned to them via email. To do so, they were asked to relay the message to people they thought were “closer” to their target. Receivers of the relayed message were asked to do the same thing. The 18 targets were from the US, Estonia, India, Australia and Norway.

• 61,168 people from 166 countries relayed the message, generating 24,163 message chains. More than half of these people were middle class and college educated North Americans. (Background information: the study was published in 2003.)

• Two reasons stood out when people were asked why they sent a message to specific recipients: geographical proximity and similar occupation.

• Hubs were not important for the success of searches.

• Weak ties are important.

• It is important that people have incentives to proceed or confidence in the search; otherwise, it’s difficult to reach the target.

## 2021-01-26 (Completed on 2021-01-27) #

Finished Zhang et al. (2006)

E-learning with non-interactive videos did not allow students to score higher or to experience more satisfaction than e-learning without video at all.
This means that to make the best of e-learning, interactivity in videos is necessary. Examples of interactivity: the ability to see slides alongside instructional videos; to control the scroll bar of videos; to take notes while watching the video; keyword search of the slides; etc.

## 2021-01-25 #

1. Finished Mei (2014)
2. Ross, J., Irani, L., Silberman, M. S., Zaldivar, A., & Tomlinson, B. (2010). Who are the crowdworkers? Shifting demographics in Mechanical Turk. In CHI'10 extended abstracts on Human factors in computing systems (pp. 2863-2872).
3. Zhang, D., Zhou, L., Briggs, R. O., & Nunamaker Jr, J. F. (2006). Instructional video in e-learning: Assessing the impact of interactive video on learning effectiveness. Information & management, 43(1), 15-27.

PP. 1-4

## 2021-01-24 #

1. Finished Miao et al. (draft)
2. Mei, H. (2014). Women’s property within the structure of marriage in the Neo-Babylonian Period. Journal of Sino-Western Communications, 6(2), 153.

PP. 1-8

## 2021-01-23 #

Miao, L., Murray, D., Jung, W-S., Lariviere, V., Sugimoto, C., & Ahn, Y-Y. (draft). The Universal structure of national scientific development

• Compared to 1973-1977, countries in the world during 2013-2017 are more specialized in one of three research clusters (Natural, Physical, and Societal). This means that most countries' scientific research is becoming less diversified.

• The scientific diversity of a country highly correlates with its GDP.

• Scientific diversity is very predictive of future economic growth of a country; more so than the Economic Complexity Index (ECI).

PP. 1-18.

## 2021-01-22 #

Finished Alessandretti et al. (2020)

The authors proposed a model: “container model”. This model, if given the trajectories of an individual, can infer the hierarchical levels of the mobility traces, and the size of each level. Based on the mobility traces of more than 700K people, the authors found that the mobility of these people has four hierarchical levels.
This means that day-to-day human mobility is not scale-free. However, when we aggregate displacements across containers (I do not fully understand what this means, though), human mobility is scale-free. Fig. 2 shows that container models do a much better job than other models in predicting mobility traces. I do have a question, and this is partly because this article is too technical for me: if we can infer the results in Fig. 2 from real data, what is the use of this container model, which has to feed on real data?

## 2021-01-21 #

1. Zhang, R. (2017). The stress-buffering effect of self-disclosure on Facebook: An examination of stressful life events, social support, and mental health among college students. Computers in Human Behavior, 75, 527-537.

A survey of 560 undergraduate students in Hong Kong found that they tend to open up when under stress and that their self-disclosure on Facebook moderates the relationship between stress and mental health. Facebook disclosure seems to be associated with lower depression.

1. Alessandretti, L., Aslak, U., & Lehmann, S. (2020). The scales of human mobility. Nature, 587(7834), 402-407.

PP. 1-2

## 2021-01-20 #

Piwowar, H., Priem, J., Larivière, V., Alperin, J. P., Matthias, L., Norlander, B., … & Haustein, S. (2018). The state of OA: a large-scale analysis of the prevalence and impact of Open Access articles. PeerJ, 6, e4375.

• What is the share of open access journal articles? Around 27.9% of all articles with a DOI. Recent articles are more likely to be open access (see Fig. 2B).

• Open access articles received 18% more citations than they would if they were not open access. This is only correlation, not causation. It might well be that authors only made their most impactful work open access.

## 2021-01-19 #

1. Skimmed through the rest of Ogden et al. (2014)

Key findings: Obesity has been prevalent among American youths and adults from 2003 to 2012. No significant changes in the prevalence were found.
This means that obesity is still a problem for the United States.

1. Blumenstock, J., Cadamuro, G., & On, R. (2015). Predicting poverty and wealth from mobile phone metadata. Science, 350(6264), 1073-1076.

The authors got access to detailed cell phone records of 1.5 million subscribers in Rwanda. They did a phone survey involving 856 individuals, who were among the 1.5 million subscribers. The authors built a model that can predict an individual’s wealth based on these 856 individuals' answers and their cell phone records. The authors used this model to predict the wealth of Rwanda by district and found that it matched very well the actual wealth index as computed from a national survey of over 10k households in Rwanda.

The authors argued that administering a national survey is very slow and costly, whereas predicting wealth from cell phone data is fast and relatively cheap. This method is accessible because a considerable number of people in developing countries are using cell phones now.

## 2021-01-18 #

1. Finished Bail et al. (2018)
2. Ogden, C. L., Carroll, M. D., Kit, B. K., & Flegal, K. M. (2014). Prevalence of childhood and adult obesity in the United States, 2011-2012. Jama, 311(8), 806-814.

PP. 1-4

## 2021-01-17 #

Bail, C. A., Argyle, L. P., Brown, T. W., Bumpus, J. P., Chen, H., Hunzaker, M. F., … & Volfovsky, A. (2018). Exposure to opposing views on social media can increase political polarization. PNAS, 115(37), 9216-9221.

Does exposure to opposing views on social media increase or decrease political polarization? This study tries to answer this question via a field experiment lasting for one and a half months on Twitter. The authors hired a company to recruit Twitter users who identify themselves as Republicans and Democrats. 1652 people participated in the study (901 Democrats and 751 Republicans). Later, participants were invited to follow a Twitter bot that would retweet 24 messages/day for a month.
The bot was designed to retweet counterattitudinal messages to the participants, i.e., Democrats would be assigned to a conservative bot and Republicans to a liberal bot. Participants did not know what type of bot they were going to be assigned to. Among those invited to follow a bot, 64.9% of Democrats and 57.2% of Republicans accepted the invitation. Fig. 1 provides an excellent summary of the experiment procedure.

Results: Treated Democrats became slightly more liberal but the effects were not statistically significant. Republicans became significantly more conservative posttreatment.

PP. 1-5

## 2021-01-16 (completed on 2021-01-17) #

Finished Park et al. (2021)

• Social networks of those following pages of Places of Worship, Community Amenities, Bars and Pubs, Indoor Recreation, and Performing Arts had two types of members: core members who are closely connected with each other, and other members who are loosely connected. This is called a “core-periphery structure”. A network with this structure indicates that there are “regulars” who visit the place often. Non-regulars tend to be friends of a “regular”.

• The fact that people do similar things in two places does not mean that the two social networks have more similarities. For example, the friendship network of Bars and Pubs is closer to that of Community Amenities than to that of Restaurants.

## 2021-01-15 (completed on 2021-01-16) #

1. Finished Fire, M., & Guestrin, C. (2019).
2. Park, J., State, B., Bhole, M., Bailey, M., & Ahn, Y-Y. (2021). People, Places, and Ties: Landscape of social places and their social network structures. arXiv.

The authors treated Facebook Pages as proxies for physical “third places” and studied the network structure of these Pages' followers. They found that networks of those following pages of Outdoor Recreation, Indoor Recreation, Restaurant, and Parks and Monuments had many independent dyads and triads.
This indicates that people visit these places in small groups and they tend to be existing friends.

## 2021-01-14 (completed on 2021-01-15) #

Continued with Fire, M., & Guestrin, C. (2019).

• The use of question or exclamation marks in paper titles is increasing (< 1% in 1950 to > 3% in 2013).

• The percentage of papers with authors listed in alphabetical order more than halved, from 43.5% in 1950 to 21.0% in 2014.

• Paper abstracts are getting longer, from a mean of 116.3 words in 1970 to 179.8 words in 2014.

• Self-citation: both the number of self-citations and the percentage of papers containing self-citations increased.

• The mean and median length of academic papers decreased: 14.4 pages in 1950, 10.1 in 1990, and 8.4 in 2014.

• Early-career scientists now are publishing more papers but are less likely to be the first authors, compared to those decades ago.

• Both the number of journals and the number of papers published per journal each year increased.

• For papers in top journals, the mean career age of first and last authors and the percentage of returning authors increased.

### Discussions #

• The majority of the observed changes mentioned above are correlated with more academic citations. Here, the two authors said that “These results support our hypothesis that the citation number has become a target.” I am a little bit doubtful of this conclusion. Although it seems intuitively true, it does call for causal evidence. I do not think they can reach this conclusion only by observing a correlation between the discussed trends and the number of citations.

## 2021-01-13 #

Continued with Fire, M., & Guestrin, C. (2019).

• Changes in scientific publication:

3. Author lists are getting longer, i.e., “hyperauthorship”, from a mean of 1.41 authors in 1900 to a mean of 4.51 authors in 2014.
4. Paper title lengths are increasing.
5. Reference lists are getting longer.
Few papers had over 20 references back in 1960 but now it is common to have papers with more than 40 references.

### Material #

As mentioned before, MAG (Microsoft Academic Graph) was used. The MAG dataset contains 120.7 million papers but many of them are news articles, comments, and responses. The two authors filtered these out, analyzing only around 22 million papers that have a DOI and at least five references. Only authors of these 22 million papers were analyzed. The authors of this paper also used the AMiner open academic graph dataset, which is relatively new.

### Results #

• Table 1 clearly shows that papers' median citations after 5 years of publication vary considerably across different fields and subfields.

• Fig. 1 shows how the number of papers has been increasing.

PP. 2-6

## 2021-01-12 #

Fire, M., & Guestrin, C. (2019). Over-optimization of academic publishing metrics: observing Goodhart’s Law in action. GigaScience, 8(6), giz053.

• What has been increasing:

  • number of published papers yearly (from fewer than 1 million in 1980 to more than 7 million in 2014)

  • speed of sharing papers (researchers can share their studies via non-traditional channels such as preprint servers)

  • number of peer-reviewed journals

  • number of published researchers

• What hasn’t changed for decades: measures of scientific success (number of publications, impact factor, h-index)

• Goodhart’s Law: When a measure becomes a target, it ceases to be a good measure

• Study Material: The two authors of this paper analyzed over 120 million published papers with 528 million references and 35 million authors, dating back to the early 1800s. The data came from the Microsoft Academic Graph (MAG) dataset.

• Study Purpose: Are researchers focusing on attaining success metrics rather than the quality of research?

• Changes in scientific publication:

1. Popularity of preprint servers such as arXiv, bioRxiv, SSRN, and RePEc
2. Mega-journals that value scientific trustworthiness rather than novelty. Prime examples are PloS One and Scientific Reports.

PP. 1-2.

## 2021-01-11 #

Finished Yin et al. (2021)

• Scientific papers that appeared in policy documents received on average 40 times more citations from other scientific papers than those not found in policy documents. This indicates that papers referenced in government policy documents were also well received and respected in the scientific community.

• Although preprint servers (medRxiv, bioRxiv, and SSRN) released many more papers on COVID-19, papers published in peer-reviewed journals appeared more frequently in policy documents. See Fig. 2.3.

• Policy documents grounded in rigorous scientific findings received more citations from other policy documents.

• National governments produced more policy documents than think tanks and intergovernmental organizations (IGOs), but they cited science the least. IGOs, especially WHO, used science the most.

I like this paper. Short and practical. It uses inferential statistics sparingly, so its findings seem more robust to me.

## 2021-01-10 #

1. Finished Myers et al. (2020)
2. Yin, Y., Gao, J., Jones, B., & Wang, D. (2021). Coevolution of policy and science during the pandemic. Science

• The growth of the share of COVID-19 policy documents mirrored that of total confirmed cases of COVID-19.

• In the beginning, COVID-19 policy documents were mostly about science & health. Later, the share of topics related to science & health fell whereas that of those regarding the societal and economic impacts of the pandemic grew. This pattern remained for other types of policy documents.

• 20% of all the scientific papers cited in COVID-19 policy documents were those uploaded or published in 2020.

• COVID-19 policy documents cited biomedical literature at the beginning of the pandemic. Later, the share of papers in the fields of economy, society and others grew.

PP. 1-2

## 2021-01-09 #

1. Continued with Frank et al. (2019)

To better predict the impact of AI on the labor market, we need better data collection that is detailed, reflects real-time changes in the market, and captures regional differences.

1. Myers, K. R., Tham, W. Y., Yin, Y., Cohodes, N., Thursby, J. G., Thursby, M. C., … & Wang, D. (2020). Unequal effects of the COVID-19 pandemic on scientists. Nature human behaviour, 4(9), 880-883.

• On average, scientists' working hours dropped from 61 h/week before the pandemic to 54 h/week during the pandemic (April 2020).

• Bench scientists, such as those working on biochemistry, biology, chemistry, and chemical engineering, saw the biggest declines (around 30% - 40%) in research time. By contrast, mathematicians, statisticians, computer scientists, and economists had the lowest decline in research time.

• Female scientists with young children had much less time for research during the pandemic.

PP. 1-3

## 2021-01-08 #

Frank, M. R., Autor, D., Bessen, J. E., Brynjolfsson, E., Cebrian, M., Deming, D. J., … & Rahwan, I. (2019). Toward understanding the impact of artificial intelligence on labor. PNAS, 116(14), 6531-6539.

People have always had concerns over the negative effects of automation and machines, from Plato worrying about writing displacing memory to Wassily Leontief, winner of the 1973 Nobel Prize in Economics, worrying that machines would displace human labor.

PP. 1-3

## 2021-01-07 #

### Aim of the study #

To explain why success in the cultural market differs so much from average performance and yet is very difficult to anticipate, even for experts.

### Study design: #

• 14,341 participants were recruited from a teen-interest website. They were randomly assigned to two conditions (independent vs. social influence) and rated 48 songs by different bands.

• In the independent condition, participants were only shown the names of the bands and the songs. After listening to a song, they were asked to rate it; the rating ranged from 1 to 5.
After rating, they were given the choice to download the song, although they were not required to do so.

• In the social influence condition, participants also saw how many times a song had been downloaded by previous participants.

• There were eight “worlds” to which participants in the social influence condition were assigned. The worlds were independent of one another, meaning that the number of downloads in one world did not affect other worlds.

• The authors of this paper conducted two experiments.

• In experiment 1, 48 songs were presented in a $16 \times 3$ grid where the order of the songs was random. Participants in the social condition saw the number of downloads along with the name of the band, whereas participants in the independent condition did not see this information.

• In experiment 2, songs were presented in a single column. For participants in the social condition, the songs were shown in descending order of current download counts, whereas the order was random for participants in the independent condition.

• Why did the authors conduct two experiments? Through this design, in each experiment, the authors can see the effects of social influence on the success of each song. Furthermore, they can see the effect of increased “strength” of related information signals (i.e., download counts) by comparing the results of the two experiments (for the social influence condition).

### Results #

Social influence, i.e., information about the choice of others, contributed to both inequality and unpredictability of the songs.

• Fig. 1 shows that there is more inequality when social influence is present: the dark bars are consistently taller than the light bars. It also shows that an increased level of social influence leads to an increased level of inequality.

• Fig.
2 shows that as social influence increases, it becomes harder to predict whether a song will turn out to be good or bad.

## 2021-01-06 #

1. Finished Gelman, A., & Loken, E. (2013).
2. Salganik, M. J., Dodds, P. S., & Watts, D. J. (2006). Experimental study of inequality and unpredictability in an artificial cultural market. Science, 311(5762), 854-856.

Finished reading but need to re-read it to recap the main steps & findings.

## 2021-01-05 #

Continued with Gelman, A., & Loken, E. (2013).

The researchers are not trying multiple tests to see which has the best p-value; rather, they are using their scientific common sense to formulate their hypotheses in a reasonable way, given the data they have. The mistake is in thinking that, if the particular path that was chosen yields statistical significance, that this is strong evidence in favor of the hypothesis.

… The result remains, as we have written elsewhere, a sort of machine for producing and publicizing random patterns.

PP. 5-13

## 2021-01-04 #

1. Finished Lu et al. (2016)

• In individualism-oriented countries, expressing happiness is encouraged whereas expressing sadness is not. It would be interesting to explore whether emojis can be leveraged to predict public opinions and sentiment of a country.

• Brazilian users have similar emoji usage patterns to those in South America, whereas they differ from users in Portugal even though the two nations speak the same language.

…whether emojis are really consistent with sentiments presented in texts.

1. Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University.

PP. 1-5

## 2021-01-03 #

Continued with Lu et al.
(2016)

• The top 20 emojis are related to faces, hearts and hands, which implies that facial expressions and body signals are the most important when people express themselves through emojis.

• The frequency of emoji usage has a power-law distribution. 😂 is by far the most popular emoji, accounting for 15.4% of the total emoji usage.

• France stands out because 1) 19.8% of all messages sent using the Kika keyboard by French users contained at least one emoji; and 2) the most commonly used emoji is ❤️. How romantic!

PP. 2-9

## 2021-01-02 #

1. Finished Palchykov et al. (2012).

I really like this study. The methodology is clear, visualizations informative, and conclusions easy to understand.

1. Lu, X., Ai, W., Liu, X., Li, Q., Wang, N., Huang, G., & Mei, Q. (2016, September). Learning from the ubiquitous language: an empirical analysis of emoji usage of smartphone users. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (pp. 770-780).

PP. 1-2

## 2021-01-01 #

Palchykov, V., Kaski, K., Kertész, J., Barabási, A. L., & Dunbar, R. I. (2012). Sex differences in intimate relationships. Scientific reports, 2(1), 1-5.

PP. 1-3

# 2020-12 #

## 2020-12-31 #

Shneiderman, B. (2018). Twin-Win Model: A human-centered approach to research success. PNAS, 115(50), 12590-12594.

• Some research questions are more useful than others. Research that solves a real-world problem has more impact and should be encouraged. Good research should lead to both new knowledge and solutions to societal problems.

… researchers need to work with professionals who have authentic problems.

## 2020-12-30 #

Finished 汪小帆. (2020)

## 2020-12-29 #

1. Continued with Kraemer et al. (2020)

• Is it possible that the above-mentioned epidemiological patterns were due to increasing testing capacity rather than travel restrictions?
The authors of the paper introduced a binary variable of testing capacity, whose value was “low” before 2020-01-20, the date when COVID-19 was categorized as a class B notifiable disease, and “high” after 2020-01-20. Compared to the naive model (see the end of p. 1 of the paper for detail regarding this model), inclusion of human mobility data from Wuhan alone led to improvements in the model’s prediction for 12 provinces (among 27 provinces that reported cases through 2020-02-06). In 10 other provinces, both testing capacity and human mobility from Wuhan improved prediction. Only for Hunan did testing alone contribute the most to the model’s prediction. Therefore, the authors concluded that although testing capacity is important, in the early stage of the epidemic, human mobility from Wuhan was the most important “driver of spread” (p. 3).

I admire this work.

1. 汪小帆 [Wang, X.]. (2020). 无标度网络研究纷争: 回顾与评述 [Debates over scale-free network research: Review and commentary]. 电子科技大学学报 [Journal of University of Electronic Science and Technology of China], 49(4), 499-510.

PP. 1-3

## 2020-12-28 #

Continued with Kraemer et al. (2020)

• The volume and frequency of human movement from Wuhan to other places in China predicted the size of the early epidemic in other provinces.

• After 2020-02-01, daily case counts became less correlated with human movement from Wuhan. This indicates that variability among different places in daily case counts was more likely due to factors other than human mobility from Wuhan before the Wuhan lockdown. This also indicates that travel restrictions are important in the early phase of epidemic control, but later, the importance of other local mitigation methods increased.

• From 2020-01-09 to 2020-01-22, variation in the epidemic’s growth rate in provinces outside of Hubei was almost entirely explained by human mobility from Wuhan. After drastic control measures were taken across China, growth rates became negatively correlated with human movement from Wuhan; that is to say, provinces with more human mobility from Wuhan before the lockdown saw smaller growth rates.
• Is it possible that the above-mentioned epidemiological patterns were due to increasing testing capacity rather than travel restrictions?

## 2020-12-27 #

1. Finished Fried et al. (2020)

People now are able to rotate their face, add makeup, open closed eyes or mouths, change hair style or wardrobe, change age, and produce synthetic videos. These methods allow people to experiment with changing their appearance, but might have negative effects such as increasing body dissatisfaction or making falsified content more prevalent on the Internet.

1. Kraemer et al. (2020). The effect of human mobility and control measures on the COVID-19 epidemic in China. Science, 368(6490), 493-497.

P. 1

## 2020-12-26 #

Fried, O., Jacobs, J., Finkelstein, A., & Agrawala, M. (2020). Editing self-image. Communications of the ACM, 63(3), 70-79.

PP. 1-6

## 2020-12-25 #

1. Finished Sinatra et al. (2016)
2. Wang, D., Song, C., & Barabási, A. L. (2013). Quantifying long-term scientific impact. Science, 342(6154), 127-132.

I skimmed through it.

## 2020-12-24 #

Continued with Sinatra et al. (2016)

The Q-model corrected the two shortcomings of the R-model. According to the Q-model:

• A paper’s potential impact $p_\alpha$ is independent of a scientist’s productivity $N_i$ and parameter $Q_i$, which means that there is luck behind the impact.

• High $Q$ is only slightly correlated with higher $N$.

Then the question is: what is the hidden parameter $Q$, and what does it indicate? According to the authors, $Q$ is the ability to turn a project with randomly picked impact into a high-impact one. For most scientists (76%), this ability is constant; it does not grow as the career progresses.

PP. 5-7

## 2020-12-23 #

Continued with Sinatra et al. (2016)

So, what is the role of a scientist’s ability in impact generation?

The authors first created a Random-impact model (R-model).
This model assumes that, no matter who the scientist is, the impact of each scientific paper is randomly chosen from the same impact distribution $P(c_{10})$. Then, the only difference between scientists is productivity $N$: how many papers she gets published in her career. This model accurately predicts the cumulative function of $P(\ge N^*/N)$ (Fig. 2E). However, this model has two problems:

1. If each paper’s impact is randomly selected from a universal distribution of $P(c_{10})$, then a scientist with a higher $N$ will be more likely to have a higher $c^*_{10}$, i.e., the citation of the highest impact paper. However, the R-model fails to predict this. See Fig. 3C.
2. Scientists with a higher average impact (computed without $c^*_{10}$) will also score higher on $c^*_{10}$. That is to say, high impact papers are more likely to be produced by a scientist with a constantly high impact. Again, the R-model fails to predict this. See Fig. 3D.

The authors then proposed an alternative model, the Q-model. This model assumes that (1) as in the R-model, each paper’s impact is randomly selected from a universal distribution of $P(c_{10})$, and (2) each scientist $i$ has a unique value $Q_i$ that modulates impact. $Q_i$ is a constant throughout a scientist’s career. The impact of a paper published by a scientist $i$ is the product of $Q_i$ and $p_\alpha$, which is randomly drawn from the distribution $P(p)$ that is the same for all scientists.

PP. 3-4

## 2020-12-22 #

Continued with Sinatra et al. (2016)

• Fig. 2E: The random impact rule is further confirmed by Fig. 2E. This figure captures the cumulative distribution of $P(\ge N^*/N)$, where $N^*/N$ denotes when the highest-impact paper of a scientist occurs. The fact that it is a straight line indicates that the highest-impact paper can appear anytime within a scientist’s career.

• Fig. 2A: We can clearly see the differences in impact across scientists at three impact levels. What’s the reason behind it: increasing productivity or increasing creativity?
The authors again randomized the papers' impact while keeping scientists' productivity unchanged. The results for both high- and low-impact scientists remained unchanged. Therefore, differences between scientists can be better explained by productivity (see Fig. 1E).

P. 3

## 2020-12-21 #

Continued with Sinatra et al. (2016)

• A first look at Fig. 2D might tell us that the chance of publishing the highest-impact paper drops with age. However, when the impact of papers was randomized while keeping a scientist’s productivity each year unchanged, the distribution of the timing $t^*$ remained almost the same. This is a very important finding, as it means that a scientist can produce her highest-impact paper anytime in her career.

P. 3

I am not sure whether only the order of a scientist’s papers was randomized, or whether the order remained unchanged but the papers' $c_{10}$ values were randomized. The authors did not make this point crystal clear.

## 2020-12-20 #

Sinatra, R., Wang, D., Deville, P., Song, C., & Barabási, A. L. (2016). Quantifying the evolution of individual scientific impact. Science, 354(6312).

• Productivity and impact are metrics to gauge a scientist’s performance.
• Fig. 1B: Only 5% of all the scientists analyzed have at least one paper that received 200 or more citations after its publication.
• Fig. 1C and 1E: High impact scientists are also much more productive. In the first three years following the first publication, medium impact scientists have more publications than the other two groups (high and low impact), but high impact scientists quickly catch up: it took ten more years for the medium impact scientists to have the same number of publications as high impact scientists, and the figure for low impact scientists is 40 more years.
• For a scientist, the timing of the highest-impact paper is truly uniform within the career, meaning that he or she can do ground-breaking work anytime within his or her academic career.

P. 2

## 2020-12-19 #

Zha, Y., Zhou, T., & Zhou, C. (2016). Unfolding large-scale online collaborative human dynamics. PNAS, 113(51), 14627-14632.

This study introduces a model that precisely fits the update history of hundreds of Wikipedia articles, which follows a double-power-law distribution. The model is based on Wikipedia, but it might also describe other forms of human collaborative activity that involve initiations and responses, such as communication via short messages and emails.

So what? With this model, we can better detect abnormal activities in online collaborative systems.

## 2020-12-18 #

1. Spiegelhalter, D. J. (2014). The future lies in uncertainty. Science, 345(6194), 264-265.
2. McNutt, M. (2014). Reproducibility. Science.

## 2020-12-17 #

Finished Singh et al. (2020)

One point worth noticing is that the negative effect of susceptibility to believing in false rumors on vaccine acceptance is stronger than the positive effect of exposure to fact-checked vaccination-related information.

## 2020-12-16 #

Continued with Singh et al. (2020)

### Research question #

Whether (1) exposure to false rumors, (2) exposure to these false rumors' fact-checks, and (3) the perceived believability of each rumor (“believability”) are related to the willingness to get vaccinated (“vaccine acceptance”).

### Methods #

From 2020-06-18 to 2020-07-13, the study survey was promoted in five languages (English, Spanish, French, Portuguese, and Arabic) on the Facebook Advertising Platform. More than 44k Facebook users from 152 countries took the survey. The authors discarded (1) incomplete and duplicated responses, and (2) responses from countries with fewer than 30 respondents. The final dataset has over 805k responses from over 18k people in 40 countries.
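
The exclusion rules described above can be sketched in a few lines of Python. This is only an illustration of the two filters as I understood them, not the authors' actual pipeline: the record layout (one dictionary per respondent with the keys `respondent_id`, `country`, and `complete`) and the function name `clean_responses` are assumptions made up for this example.

```python
from collections import Counter


def clean_responses(responses, min_per_country=30):
    """Apply the two exclusion rules to a list of survey records.

    Each record is a dict with hypothetical keys:
    'respondent_id', 'country', and 'complete' (bool).
    """
    seen_ids = set()
    kept = []
    for record in responses:
        # Rule 1: drop incomplete responses and duplicated respondents.
        if not record["complete"] or record["respondent_id"] in seen_ids:
            continue
        seen_ids.add(record["respondent_id"])
        kept.append(record)

    # Rule 2: drop countries with fewer than `min_per_country` respondents.
    per_country = Counter(record["country"] for record in kept)
    return [r for r in kept if per_country[r["country"]] >= min_per_country]


# Toy data: 30 valid respondents from country "A", one duplicate,
# one incomplete response, and only two respondents from country "B".
rows = [{"respondent_id": i, "country": "A", "complete": True} for i in range(30)]
rows.append({"respondent_id": 0, "country": "A", "complete": True})     # duplicate
rows.append({"respondent_id": 100, "country": "A", "complete": False})  # incomplete
rows += [{"respondent_id": 200 + i, "country": "B", "complete": True} for i in range(2)]

cleaned = clean_responses(rows)
print(len(cleaned), {r["country"] for r in cleaned})
```

Note that the country threshold is applied after the first filter, so a country is judged by its number of valid respondents. Whether the authors applied the rules in this order is not stated in my notes.
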
The survey was quite simple, as it comprised only three questions:

1. Have you seen or heard this information in the past month?
2. Have you ever seen an official source confirming or denying this claim?
3. How believable does this information seem to you?

At the end of the survey, the respondents were also asked to provide demographic information, and to indicate the extent to which they saw the coronavirus as a threat (“perceived threat”).

### Results #

1. Exposure to misinformation and perceived threat are positively related to believability.
2. Exposure to misinformation alone is not strongly correlated with vaccine acceptance. However, the believability of false information is negatively correlated with vaccine acceptance.
3. Exposure to vaccine-related misinformation is positively correlated with vaccination hesitancy, and with the believability of false vaccine-related rumors. Exposure to fact-checked vaccination-related information is positively correlated with vaccine acceptance.
4. Exposure, believability, and fact-checking of other types of false rumors are not correlated with vaccine acceptance.
5. Perceived threat is positively correlated with vaccine acceptance.

P. 4

## 2020-12-15 #

1. Finished Petersen et al. (2011)

Key takeaway: If you want a long and successful career, for example, being able to publish in top journals many times rather than just once, it’s important that you make progress at the beginning of your career.

1. Singh et al. (2020). COVID-19 Misinformation, Believability, and Vaccine Acceptance Over 40 Countries. Preprint.

PP. 1-3

## 2020-12-14 #

Continued with Petersen et al. (2011)

PP. 2-4

## 2020-12-13 #

1. Finished Evans (2008)

Major findings: 1) Even though more journal articles published long ago became available online, scientists tend to cite more recent papers; 2) Even though more articles are becoming available online, fewer journals and articles are being cited, and citations become concentrated on fewer journals and articles.
Evans (2008) explains that this might be because scholars find it easier to locate prevailing opinions if they search online. Journals and articles that scholars might have skimmed in the print age are now overlooked, pushing citations toward newer and fewer articles. This is alarming because it indicates that as online journals become more available, scientific studies are building upon fewer, rather than more, ideas.

1. Petersen, A. M., Jung, W. S., Yang, J. S., & Stanley, H. E. (2011). Quantitative and empirical demonstration of the Matthew effect in a study of career longevity. PNAS, 108(1), 18-23.

PP. 1-2

## 2020-12-12 #

1. Finished Wu et al. (2019)

The title of this paper explains the main finding. One point worth noticing is that, as the authors mentioned at the end of the paper, government funding for individuals or small teams does not enable smaller teams to produce disruptive results. This is because small teams funded by the government don’t want to take the risk of entering uncharted areas.

This paper explains my intuition: more and more scientists, and more and more money invested in research, do not necessarily mean more ground-breaking results. Also, I don’t believe large teams should always dominate the science community. I also don’t think it’s a good idea to always promote publications with a massive number of authors. Teamwork is great, but does not always produce disruptive work.

1. Evans, J. A. (2008). Electronic publication and the narrowing of science and scholarship. Science, 321(5887), 395-399.

PP. 1-5

## 2020-12-11 #

1. Finished Simmons et al. (2011)
2. Wu, L., Wang, D., & Evans, J. A. (2019). Large teams develop and small teams disrupt science and technology. Nature, 566(7744), 378-382.

PP. 1-4

## 2020-12-10 #

Somers, J. (2018). The scientific paper is obsolete. The Atlantic, 4.

## 2020-12-09 #

1. Ioannidis, J. P. (2005). Why most published research findings are false. PLoS medicine, 2(8), e124.

I skimmed through it.
Couldn’t understand it.

1. Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological science, 22(11), 1359-1366.

PP. 1-6

## 2020-12-08 #

Bohannon, J. (2016). Who’s downloading pirated papers? Everyone. Science. Retrieved from https://www.sciencemag.org/news/2016/04/whos-downloading-pirated-papers-everyone

## 2020-12-07 #

1. Finished Fernandes et al. (2018)

Key points: the title explains everything. I love the title.

1. Grabowicz, P. A., Ramasco, J. J., Moro, E., Pujol, J. M., & Eguiluz, V. M. (2012). Social features of online networks: The strength of intermediary ties in online social media. PloS one, 7(1), e29358.

## 2020-12-06 #

Fernandes, M., Walls, L., Munson, S., Hullman, J., & Kay, M. (2018, April). Uncertainty displays using quantile dotplots or CDFs improve transit decision-making. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

PP. 1-6

## 2020-12-05 #

Ai, W., Chen, R., Chen, Y., Mei, Q., & Phillips, W. (2016). Recommending teams promotes prosocial lending in online microfinance. PNAS, 113(52), 14944-14948.

Key questions:

1. Which recommendations make people more likely to join online teams on a crowdlending platform?
2. After people join a team, will they lend more money?

Key findings:

1. Recommendation emails did increase the probability that a person joins a team;
2. People receiving a recommendation email that explains the recommendation by location similarity are more likely to join a team;
3. People are more likely to join teams that have similar location and higher status;
4. After joining a team, members lend more money in the subsequent week.

## 2020-12-04 #

1. Meier, A., Gilbert, A., Börner, S., & Possler, D. (2020). Instagram Inspiration: How Upward Comparison on Social Network Sites Can Contribute to Well-Being. Journal of Communication, 70(5), 721-743.

Skimmed through it.
Didn’t like it.

1. Hilbert, M., & Darmon, D. (2020). Large-Scale Communication is More Complex and Unpredictable with Automated Bots. Journal of Communication, 70(5), 670-692.

Skimmed through it.

1. Danescu-Niculescu-Mizil, C., Cheng, J., Kleinberg, J., & Lee, L. (2012, July). You had me at hello: How phrasing affects memorability. In Proceedings of the ACL.

… memorable quotes consist of unusual word sequences built on common syntactic scaffolding.

PP. 1-8

Forgot about this paper (You had me at hello) on 2020-12-05. Finished it on 2020-12-19.

## 2020-12-03 #

Finished Hosseinmardi et al. (2020)

Main findings:

1. Consumption of radical political news content on YouTube is smaller in volume but more engaging than other content, and its popularity has been rising.
2. This consumption reflects a broader social trend, and is perhaps not due to YouTube's recommendation algorithms.

## 2020-12-02 #

1. Service, R. ‘The game has changed.’ AI triumphs at solving protein structures. From Sciencemag.
2. Hosseinmardi, H., Ghasemian, A., Clauset, A., Rothschild, D. M., Mobius, M., & Watts, D. J. (2020). Evaluating the scale, growth, and origins of right-wing echo chambers on YouTube. arXiv preprint arXiv:2011.12843.

PP. 1-2

## 2020-12-01 (completed on 2020-12-02) #

Yang, T., Majo-Vazquez, S., Nielsen, R. K., & González-Bailón, S. (2020). Exposure to news grows less fragmented with an increase in mobile access. From https://www.pnas.org/content/117/46/28678 .

Key question: technologies give people more choices regarding consuming news. How does this affect the overall news consumption pattern? Do audiences become more fragmented (i.e., “audiences disperse among the higher number of choices”)?

Conclusions:

1. Selective exposure exists but it does not grow in magnitude as the choices of news content increase.
2. 
The pattern and effect of consuming news on desktop are different from those of consuming news across multiple platforms, with the latter reaching increasingly larger audiences, attracting more time spent on news consumption, and making audiences less fragmented.
3. More than half of the US population access little to no news. These people might also be susceptible to misinformation.

# 2020-11 #

## 2020-11-30 #

1. Finished Huberman et al. (2008)

• The number of friends, rather than that of followers, more accurately reflects someone’s Twitter activity.
• A link between two users on social media like Twitter does not imply that there is interaction between them.

1. Forbush, E., & Foucault-Welles, B. (2016). Social media use and adaptation among Chinese students beginning to study in the United States. International Journal of Intercultural Relations, 50, 1-12.

I skimmed through it.

## 2020-11-29 #

1. Finished Donoho (2015)

In the science of the future, a scientific paper is not the scholarship itself, but the “advertising” of the work. The scholarship will be data and code, which, of course, are “universally citable and programmatically retrievable”.

1. Huberman, B. A., Romero, D. M., & Wu, F. (2008). Social networks that matter: Twitter under the microscope. arXiv preprint arXiv:0812.1045.

Hypothesis: the number of contacts (followers and friends) is positively related to the intensity of Twitter activity.

PP. 1-5

## 2020-11-28 (Completed on 2020-11-29) #

Continued with Donoho (2015).

PP. 10-18

## 2020-11-27 #

1. Blumenstock, J. E. (2008, April). Size matters: word count as a measure of quality on wikipedia. In Proceedings of the 17th international conference on World Wide Web (pp. 1095-1096).

This study is really fun.

1. David Donoho (2015). 50 years of Data Science.

PP. 1-9

## 2020-11-26 #

1. Finished Gilbert & Karahalios (2009)
2. Salganik et al (2020). Measuring the predictability of life outcomes with a scientific mass collaboration.
Proceedings of the National Academy of Sciences, 117(15), 8398-8403.

Study design: Data about families and children were available for waves 1-5 (from child birth to age 9) but not yet for wave 6 (child age 15). Researchers tried to predict the outcomes of wave 6 (child GPA, child grit, household eviction, household material hardship, caregiver layoff, and caregiver participation in job training) based on all the data for waves 1-5 and half of the data for wave 6.

Result: Scientists leveraging complicated machine learning algorithms could not predict those outcomes correctly. If a score of 1 means perfectly accurate, and 0 not accurate at all, the best predictions got a score of 0.2 for material hardship and child GPA, and only 0.05 for the other four outcomes.

Also note that a linear regression or logistic regression model with only four variables chosen by domain experts was only slightly worse than the best submission, and was much better than most of the other submissions.

… the submissions were much better at predicting each other than at predicting the truth.

## 2020-11-25 #

Continued with Gilbert & Karahalios (2009)

## 2020-11-24 #

Gilbert, E., & Karahalios, K. (2009, April). Predicting tie strength with social media. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 211-220).

### Key question: #

1. Can dimensions of tie strength predict tie strength? How?
2. Limitations of this model?

### Methods #

Participants on Facebook answered five tie strength questions for as many friends as possible within half an hour, ending up with 2184 Facebook friendships rated. The five questions are: how strong the relationship is, how comfortable they would be asking for a loan, how helpful the friend would be if they were looking for a job, how upset they would be if unfriended, and how important it would be to bring the friend to a new channel if need be.
(See Table 2)

Independent variables: The researchers then used 74 Facebook variables (such as Wall words and inbox words) to predict intensity, intimacy, duration, reciprocal services, structure, emotional support, and social distance. (See Table 1) Participants also answered questions regarding their demographics and Facebook usage, and their friends' Facebook usage.

Dependent variables: answers from participants to the five questions mentioned above.

PP. 1-4

## 2020-11-23 #

1. Finished Weng et al. (2018)

Links between people can be categorized into two types based on reciprocity: social links and informational links.

Second result: Weak ties attracted as much attention as strong ties, or even more.

… people interact along strong ties due to their social relationships, while looking for novel information through weak ties.

1. Finished Lilian Weng’s article Attention? Attention!. Couldn’t understand it.

## 2020-11-22 #

Weng, L., Karsai, M., Perra, N., Menczer, F., & Flammini, A. (2018). Attention on weak ties in social and communication networks. In Complex Spreading Phenomena in Social Systems (pp. 213-228). Springer, Cham.

Many papers on the strength of weak ties do not answer the question of whether weak ties carry important or novel information. To answer this question, we can test whether users pay more attention to information that travels through a weak tie. But how can we measure “attention”? A reasonable proxy is the number of friends a user has in a social network.

Result:

1. Strong ties carry more traffic, confirming that people communicate more with close friends.

PP. 1-12

## 2020-11-21 #

Centola, D. (2019). Influential networks. Nature human behaviour, 3(7), 664-665.

Ordinary people, instead of influencers (“hubs”), are more likely to propagate complex contagions because they offer more social reinforcement.

## 2020-11-20 #

Guilbeault, D., Becker, J., & Centola, D. (2018).
Social learning and partisan bias in the interpretation of climate trends. Proceedings of the National Academy of Sciences, 115(39), 9714-9719.

### Central question #

Does information exchange in bipartisan communication networks increase or decrease partisan bias?

### Literature #

The drawback of previous studies: participants had conversations, so researchers could not distinguish between the effect of partisan priming and that of exposure to opposing views.

### Design & Procedure #

Four groups: a control group in which participants had the same political ideology; Groups 2, 3, and 4 were all structured social networks with an equal number of conservatives and liberals. Group 2 were only shown the average of their 4 network neighbors, without any other information. Group 3 were shown the average of their neighbors plus the logos of political parties. Group 4 were shown the average as well, along with the neighbors' political identity.

Each group provided estimates three times. In the first round, each member estimated independently. In Round 2 and Round 3, the control group revised their answers independently. Group 2 revised their answers while being exposed to their neighbors' average answer. Group 3 revised their answers while being exposed to Republican and Democratic party logos below their neighbors' average estimate. Group 4 revised their responses while being shown the usernames and political identification of each of their four neighbors, and the average of these four neighbors' answers.

### Results #

Group 2: both liberals and conservatives improved their trend accuracy, and partisan bias in interpreting climate change data was eliminated. Even conservatives in this group predicted trends significantly more accurately than liberals in the control condition.

Group 3: there was no effect of social learning, and the belief polarization of Round 1 was maintained.

Group 4: trend accuracy improved but moderate belief polarization remained.
(Belief polarization means that liberals significantly outperformed conservatives in predicting climate change trends.)

• Exposure to logos of political parties had a stronger effect on decreasing the impact of social learning than exposure to neighbors' political identity.
• Both conservatives and liberals improved their prediction accuracy thanks to information exchange in networks, even when exposed to their network neighbors' political identification.

### Robustness check #

Can social learning reduce polarization in homogeneous networks (i.e., networks that are not bipartisan)? Robustness tests showed that the effect of social learning was reduced in politically homogeneous networks: by Round 3, trend accuracy of conservatives in these echo chambers did not differ significantly from that of conservatives in the control condition. Considering this result, instead of saying the effect of social learning was reduced, I think it’s more accurate to say it was removed.

Another question is whether the effect of social learning remains if participants in Group 4 are shown individual answers rather than an average. Results showed that the effect of social learning was robust to exposure to individual responses.

I am very puzzled by the result. It showed that the effect of social learning was eliminated in homogeneous networks, but the paper on the wisdom of partisan crowds showed exactly the opposite result.

### Conclusion & suggestion #

It’s better to have political discussions in bipartisan networks without partisan cues.

## 2020-11-19 #

Continue with Guilbeault, Becker, & Centola. (2018)

• Complex contagions require a critical mass to start a large-scale cascade, and the critical mass depends on network topology, node degree distribution, and adoption thresholds.

New directions:

1. Ecologies of complex contagions: how several contagions interact with each other within a network and across networks.

The following were added on 2020-11-22

1. 
Heterogeneity of thresholds

Thresholds of contagions vary. Different people and different activities may have different thresholds.

1. Homophily and diversity in diffusion

Identity-based diversity means one’s neighbors have different characteristics. Structural diversity means one’s neighbors belong to different components of the network. The first kind of diversity reduces the spreading of complex contagions whereas the second one amplifies it.

If one has too many friends, he or she might not be an ideal target of complex contagions. This is because complex contagions need multiple reinforcements. When you have too many friends, you receive fewer repeated exposures. Therefore, “clustered, homophilous networks” are conducive to complex contagions.

• How do people infer global structure from their local interactions?
• How do future social media networks facilitate (or reduce?) these inferences by giving people more (or less?) information about their broader ego network?

## 2020-11-18 #

Continue with Guilbeault, Becker, & Centola. (2018)

• When social networks get smaller, it becomes easier for simple contagions to spread but harder for complex contagions.
• Research in complex contagions: health, innovation, social media, and politics
• Peer characteristics, such as homophily and diversity, influence how likely a behavior is to change.
• Diffusion of innovations is characterized by complex contagions.
• Dynamics of adoption might be different from those of termination.
• Which has more effect on the likelihood of spreading through social influence: the influence of the source person, or the quality of the contagion?

PP. 7-14.

## 2020-11-17 #

1. Finished Becker, Porter, & Centola. (2019). The wisdom of partisan crowds. Proceedings of the National Academy of Sciences, 116(22), 10717-10722.

Aim: to see whether there is “wisdom of crowds” in politically homogeneous networks.
Experiment design: Participants were randomly assigned to two conditions: control condition vs social condition. They were asked to answer a question three times (rounds):

• Participants in the control condition provided the answer independently three times.
• Those in the social condition answered independently in Round 1. In Round 2, they were shown the average answer of four other participants connected to them in a social network and then updated their answer. In Round 3, they were shown the average of the updated answers of the same four participants as in Round 2 and provided a final answer to the question.
• A network consists of 35 participants who share the same political orientation (either Democrats or Republicans). Participants in the network did not know that other people in the network had the same partisan preference as theirs.
• The researchers tested four questions. Each question was answered by 3 network groups and 1 control group for each political party.

Results of Experiment 1: Information exchange in homogeneous networks increased accuracy for members of both parties and decreased belief polarization.

Individual learning (being able to edit their answers in Round 2 and 3) was not the reason for the increased accuracy, because the decrease in the error of the truth-centered mean (the absolute distance between the group mean and the true answer) in the social group was significantly larger than that in the control group. Therefore, the change should be attributed to information from others.

Another possible explanation is that the increased accuracy of groups as a whole obscured decreased accuracy at the individual level, for example, if the standard deviation of responses in a group increased. Results showed that for social groups, the standard deviation of responses in Round 3 was significantly smaller than that in Round 1. This change did not occur in the control group, indicating that similarity within social groups increased.
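
To make the two group-level measures above concrete for myself, here is a minimal Python sketch. It only illustrates how the measures could be computed; it is not the authors' analysis code. The function names (`group_error`, `round_changes`) and all the numbers are made up, and I am assuming that group error means the absolute distance between the group's mean estimate and the true answer.

```python
from statistics import mean, pstdev


def group_error(estimates, truth):
    """Absolute distance between the group's mean estimate and the truth."""
    return abs(mean(estimates) - truth)


def round_changes(round1, round3, truth):
    """Change from Round 1 to Round 3 in group error and in spread.

    Negative values mean the group became more accurate (error_change)
    or more similar to each other (sd_change).
    """
    return {
        "error_change": group_error(round3, truth) - group_error(round1, truth),
        "sd_change": pstdev(round3) - pstdev(round1),
    }


# Hypothetical toy data: the true answer and five estimates per round.
truth = 100
social = round_changes(
    round1=[40, 70, 90, 130, 220],   # group mean 110, error 10
    round3=[80, 95, 100, 110, 130],  # group mean 103, error 3
    truth=truth,
)
control = round_changes(
    round1=[40, 70, 90, 130, 220],   # group mean 110, error 10
    round3=[45, 70, 90, 125, 215],   # group mean 109, error 9
    truth=truth,
)

print("social: ", social)
print("control:", control)
```

In this toy data, the social group's error drops by 7 while the control group's drops by only 1, and the spread of answers shrinks far more in the social group. That is the pattern the paper reports, although the real analysis compares many groups with statistical tests rather than a single pair.
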
The design of the replication study differs from Experiment 1 in several ways:

1. More controversial questions;
2. Participants were required to confirm their political preference before participating in the study;
3. The experiment interface included an image of an elephant and a donkey;
4. The answers from the four other connected participants were accompanied by their political orientation;
5. Subjects knew that they were participating in a study related to “Politics Challenge”.

Items 2 - 5 were partisan primes intended to “enhance the effects of partisan bias on social information processing” (p. 4).

Results of the replication study: as in Experiment 1, social learning increased answer accuracy for both Democrats and Republicans. Participants within each group became more similar over time. So,

… social learning is robust to partisan priming for both group-level improvement and individual improvement.

But how about the difference between Democrats and Republicans? The above results showed within-group changes but not between-group changes. Results showed that between-group similarity also increased for participants in the social condition (37% for Experiment 1 and 46% for the replication study), which means that polarization decreased.

To recap: social information exchange within homogeneous networks helped people make more accurate estimates. Similarity within and between groups increased, indicating that people within social groups became more similar, and that polarization diminished. And this result withstood partisan priming.

My question: Will the result stay the same if information exchange is not confined to numeric estimates? Why don’t we allow people to chat? Is it because of a lack of technical support, or is there a theoretical consideration against it?

Conclusion: Homogeneous networks do not necessarily lead to polarization. In fact, polarization decreased and accuracy increased. Then why do we still have polarized public opinions?
This is because popular social media are centralized networks, which enable influencers to exert disproportionate effects on other people in the network.

Future directions:

• Any other reasons why “echo chambers” and polarization coexist in reality?
• Is it possible to replicate this study in real-life networks? For example, on Facebook or Twitter, where information exchange is not limited to numeric estimates?
• How could we eliminate, or at least reduce, the effects of influencers in a network, if that is possible at all?

1. Guilbeault, Becker, & Centola. (2018). Complex contagions: A decade in review. In Complex spreading phenomena in social systems (pp. 3-25). Springer, Cham.

PP. 1-7

## 2020-11-16 (Edited on 2020-11-19 and 2020-11-20) #

Continue with Becker, Porter, & Centola. (2019)

## 2020-11-15 #

Becker, Porter, & Centola. (2019). The wisdom of partisan crowds. Proceedings of the National Academy of Sciences, 116(22), 10717-10722.

## 2020-11-14 #

Continue with Popp, T. (2019)

The experiment comparing the effects of two different network structures (clustered vs random) on the spread of behavior was summarized in my earlier post.

Another experiment involves an 11-week fitness initiative among 800 graduate students at Penn. The experiment consisted of four groups:

1. Group 1: Control group. Participants were allowed to sign up for fitness classes through an online portal.
2. Group 2: Same as the control group, but participants were also divided into groups based on their similarities. On the online portal, class attendance of anonymized “health buddies” was displayed. Communication with each other was not possible.
3. Group 3: Access to the online portal + groups based on similarities + communication between health buddies
4. Group 4: Conditions in Group 3 with scores of other groups displayed on the portal.

In Group 1 and 2, individuals completing the most classes were promised monetary rewards.
In Group 3 and 4, the health buddy groups completing the most classes got the monetary rewards.

Results: 1) Exercise rates in Group 2 and Group 4 were much higher; 2) Group 3 did worse than the control group.

## 2020-11-13 #

Popp, T. (2019). The Virality Paradox. The Pennsylvania Gazette. Retrieved from https://ndg.asc.upenn.edu/wp-content/uploads/2019/05/Virality-Paradox.pdf

• Even without digital tools to communicate with each other, rebel activities became more widespread in Syria. This is surprising because without telecommunication, Syrian rebels lost the long ties that bridge groups far away from each other. That is, they had to rely on face-to-face communication to coordinate. How did it happen? Once you understand that the way behaviors spread is different from the way information diffuses, the answer becomes clearer.
• Information, messages, and ideas spread like an epidemic whereas human behaviors don’t. Contagions can be classified into two types: simple and complex. A single contact can start a simple contagion, but won’t do the same for complex contagions, which involve efforts and costs, and require confirmation or reinforcement from multiple sources.
• Long ties are enough for simple contagions whereas complex contagions favor wide ties. How wide a tie should be for a behavior to spread varies. If reputation is at stake, the threshold will be higher. That’s where complex contagions are very different from simple ones. In simple contagions, hubs get infected early and then spread the infection to many others. In a complex contagion, however, hubs usually have reputation at stake, so they are less, rather than more, likely to get infected.

## 2020-11-12 #

Finished Guilbeault & Centola.
(2020)

… allowing smokers and nonsmokers to exchange views while aware of each other’s smoking status effectively reduced bias both in their evaluation of health risks, and in their beliefs about each other’s capacity to accurately interpret scientific data about the health risks of tobacco use.

An interesting finding in this study is that after interacting with each other in social networks (which was limited to numeric estimates in the study), smokers and nonsmokers did not differ significantly in their perceptions of smokers' ability to understand health information associated with smoking. This means that biases were reduced.

## 2020-11-11 #

Continue with Guilbeault & Centola. (2020)

Study design: 1,600 people were recruited via Amazon’s Mechanical Turk. There were 10 independent trials in the experiment. Each trial involved 160 participants who were randomly assigned to the following three groups:

1. Control group: 40 smokers / 40 non-smokers, so 80 people in each trial.
2. Anonymous network group: 40 people (20 smokers and 20 nonsmokers) were embedded into a random social network that is decentralized and anonymous.
3. Informative network group: 40 people (20 smokers and 20 nonsmokers) were put into a social network where they could see the usernames and the smoking status of their four network neighbors.

Procedure: Participants were shown an anti-smoking advertisement and were asked to estimate the health risk of smoking by answering this question: How many people (in millions) are predicted to die from tobacco use in developed countries, in 2030? Participants were incentivized by a monetary reward based on the accuracy of their final answer. Changes in the answers' accuracy were measured by the difference between Round 1 and Round 3.

• Round 1: Participants in all groups provided the answer independently.
• Round 2: Group 1 revised their estimates with independent reflection.
Group 2 were shown the average response of their contacts and then revised their estimates. Group 3 were also shown the average response of their four contacts, along with the usernames and the smoking status of those contacts.
- Round 3: Same as in Round 2.

Results:

- In Round 1, smokers and nonsmokers were equally inaccurate at estimating the health risk of smoking;
- There was no significant improvement in estimate accuracy in the control group;
- The decrease in estimate error in Group 2 was significantly greater than that of both smokers and nonsmokers in the control group;
- The decrease in estimate error in Group 3 was significantly greater than that in Group 2. Specifically, this decrease was ten times greater than that of both smokers and nonsmokers in the control group.

## 2020-11-10 #

Guilbeault, D., & Centola, D. (2020). Networked collective intelligence improves dissemination of scientific information regarding smoking risks. PLoS ONE, 15(2), e0227813.

PP. 1-6

## 2020-11-09 #

Centola (2020). Why Social Media Makes Us More Polarized and How to Fix It. Retrieved from https://www.scientificamerican.com/article/why-social-media-makes-us-more-polarized-and-how-to-fix-it/ .

The more equity in people’s social networks, the less biased and more informed groups will become–even when those groups start off with highly partisan opinions.

- We believe that if we are put in a group consisting of like-minded people (so-called “echo chambers”), we probably won’t develop ideas that are on the opposite side of the spectrum. However, two social media experiments found the opposite. In one study, Democrats and Republicans were put into “echo chambers” and discussed polarizing issues such as gun control, the unemployment rate, and immigration. Both groups ended up moving toward a more moderate view of the topics.
- In another study, smokers and nonsmokers estimated the risks of cigarette smoking.
After the study, both groups had a more accurate understanding of the topic, and a higher opinion of the other group.
- Social media of our time exacerbates rather than eradicates partisan bias, because it’s centralized, rather than egalitarian. In a centralized network, influencers filter or even block information. For example, if an influencer spreads a piece of wrong information, it might end up becoming an entrenched false belief in the whole community, whereas in an egalitarian network, each person has an equal say, and ideas are weighed by their own quality rather than the influence of the people behind them.

## 2020-11-08 #

1. Finished Salehi & Bernstein (2018)

This paper is a little bit too long. I skimmed through the last 2/3 of it.

1. Ahn, Y. Y., Ahnert, S. E., Bagrow, J. P., & Barabási, A. L. (2011). Flavor network and the principles of food pairing. Scientific reports, 1, 196.

Main Takeaway: North American and European cuisines tend to combine ingredients with shared flavors but East Asian dishes don’t.

## 2020-11-07 #

Continue with Salehi & Bernstein (2018)

- To boost cooperative work, intermix people, not ideas.

PP. 1-10

## 2020-11-06 #

1. Finished Ahn et al. (2007)

The second half of the paper is difficult for me, so I skimmed through it.

1. Schich et al. (2014). A network framework of cultural history. Science, 345(6196), 558-562.

I skimmed through it.

1. Salehi, N., & Bernstein, M. S. (2018). Hive: Collective design through network rotation. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), 1-26.

P. 1.

## 2020-11-05 #

Continue with Ahn et al. (2007)

It’s insightful to examine the distribution of clustering coefficients of different degrees. The clustering coefficient of degree $k$ is represented as $C(k)$.

Degree of separation is the mean distance between two nodes.

It surprised me that Professor YY used red and green for visualizations, which is unfriendly for color blind people.

PP.
4-9

## 2020-11-04 [Completed on 2020-11-05] #

Continue with Ahn et al. (2007)

- Three network sampling methods: node sampling, link sampling, and snowball sampling.
- Node sampling: randomly select several nodes; the links between these selected nodes are included in the sample;
- Link sampling: similar to node sampling, randomly select a number of links; the nodes attached to these links are included in the sample;
- Snowball sampling: randomly select a seed node and do a breadth-first search until the number of selected nodes reaches the target. Only links between selected nodes are included in the sample. I need to brush up on breadth-first search. I have forgotten the algorithm already.
- A power-law degree distribution is usually plotted as a CCDF (complementary cumulative distribution function). Yeah, I missed this point when I first learned about power laws.
- Clustering coefficient of a node:

$$\frac{\text{Number of existing links between its neighbors}}{\text{Number of all possible links between its neighbors}}$$

It describes how well its neighbors are connected. The clustering coefficient of a network is the mean of all nodes' clustering coefficients. It stands for the probability of a link between two randomly selected nodes that share a neighbor.

## 2020-11-03 #

1. Finished Steegen et al. (2016)

In a more complete analysis, the multiverse of data sets could be crossed with the multiverse of models to further reveal the multiverse of statistical results.

This is so true. I have several thoughts about this point:

First, it shows how arbitrary the choices in data processing and model picking are, and therefore, how arbitrary the statistical results might be. When I was doing research on selfies, I also had the same feeling.
When I was doing the content analysis study comparing White women’s selfies on Twitter with Chinese women’s selfies on Weibo, I had to make so many arbitrary choices: whether to drop an item from a construct, whether to combine items, whether to use a t test or a nonparametric test, etc.

Second, reporting a multiverse analysis is methodologically very challenging. I cannot imagine a master’s student who has attended one statistics class doing a project involving more than 200 choice combinations.

Finally, scholars will find it more difficult to cite others' studies. Right now, it’s fairly easy to cite because almost all research papers report a definite result. With multiverse analysis, almost all studies will involve many uncertainties. This complicates how people interpret the statistical results.

That said, statistics is about uncertainty. It’s certainly good to show these uncertainties. I think the science community, and the public, should get accustomed to seeing uncertainties in statistical results in the coming years.

1. Ahn, Y. Y., Han, S., Kwak, H., Moon, S., & Jeong, H. (2007, May). Analysis of topological characteristics of huge online social networking services. In Proceedings of the 16th international conference on World Wide Web (pp. 835-844).

PP. 1-2

## 2020-11-02 #

Continue with Steegen et al. (2016)

We suggest that, if several processing choices are defensible, researchers should perform a multiverse analysis instead of a single data set analysis. A multiverse analysis is a way to avoid or at least reduce the problem of selective reporting by making the fragility or robustness of the results transparent, and it helps the identification of the most consequential choices. Even when confronted with only one arbitrary data processing choice, researchers should be transparent about it and reveal the sensitivity of the result to this choice.
Increasing transparency in reporting through a multiverse analysis is valuable, regardless of the inferential framework (frequentist or Bayesian), and regardless of the specific way uncertainty is quantified: a p value, an effect size, a confidence (Cumming, 2013) or credibility (Kruschke, 2010) interval, or a Bayes Factor (Morey & Rouder, 2011).

The authors argued that “preregistration or blind analysis are not useful strategies for deflating the multiverse” (p. 709). I totally agree. I am not familiar with blind analysis, so I’ll just talk about preregistration. As the authors noted, even if the study is preregistered, the result still reflects just one of the many possible choice combinations, albeit a preregistered one. Therefore, the results of a preregistered study are still arbitrary if the research involves arbitrary, or “whimsical”, choices in data construction.

The authors also talked about the “model multiverse” at the end of the article.

Something I don’t understand yet in this paper:

When participants are excluded based on reported or computed cycle length, we do not consider next menstrual onset based on computed or reported cycle length, respectively.

When only one choice is clearly and unambiguously the most appropriate one, variation across this choice is uninformative.

## 2020-11-01 #

Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing transparency through a multiverse analysis. Perspectives on Psychological Science, 11(5), 702-712.

- Some common measures to solve the reproducibility crisis in social sciences: high power, adjusting the $\alpha$ level, focusing on estimation rather than testing, using Bayesian statistics.
- How can we increase transparency in research: pre-registration, sharing data & research materials.
- I agree that there are so many choices to make when dealing with raw data. So the same raw data might end up becoming many different datasets ready for analysis if it were processed by many researchers.
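
To make the point about processing choices concrete for myself, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy data, the two exclusion rules, and the statistic (a plain difference in group means) are made up and are not taken from Steegen et al.

```python
from itertools import product
from statistics import mean

# Hypothetical raw data: (group, score, response_time_in_seconds).
raw_data = [
    ("treatment", 7.0, 12), ("treatment", 6.5, 3),
    ("treatment", 9.0, 20), ("treatment", 2.0, 15),
    ("control", 5.0, 14), ("control", 5.5, 2),
    ("control", 6.0, 18), ("control", 1.0, 16),
]

# Each processing step has more than one defensible option (all made up).
processing_choices = {
    "min_response_time": [0, 5],  # keep everyone, or drop very fast responders
    "min_score": [0.0, 1.5],      # keep everyone, or drop extremely low scores
}


def mean_difference(rows):
    """Treatment mean minus control mean for one processed data set."""
    treatment = [score for group, score, _ in rows if group == "treatment"]
    control = [score for group, score, _ in rows if group == "control"]
    return mean(treatment) - mean(control)


# Cross every option of every step: one processed data set per combination.
multiverse = {}
for min_rt, min_score in product(*processing_choices.values()):
    rows = [r for r in raw_data if r[2] >= min_rt and r[1] >= min_score]
    multiverse[(min_rt, min_score)] = mean_difference(rows)

for combination, difference in multiverse.items():
    print(combination, round(difference, 3))
```

Two binary choices already produce four data sets, and in this toy case the same raw data gives a difference anywhere between 0.5 and 2.0 depending on which combination is picked.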
This is what a multiverse analysis is trying to do: to list all possible (and reasonable) datasets derived from the raw data, and show all possible statistical results.

A multiverse analysis displays the stability or robustness of a finding, … across different options for all steps in data processing. (p. 703)

PP. 702-707

# 2020-10 #

## 2020-10-31 #

Fig. 1 shows the happiness distribution of words in each language, but how does each word’s happiness score vary between languages? Google Translate is used. The result can be found in Fig. 2. As can be seen, the order changed a little bit, but the overall pattern remained. Spanish is the “happiest”, and Chinese is the “saddest” (I highly doubt so, though).

Another interesting question to ask is whether a word’s happiness score is associated with its frequency of use. As can be seen in Fig. 3, it turns out they are not associated.

## 2020-10-30 #

Dodds et al. (2015). Human language reveals a universal positivity bias. Proceedings of the National Academy of Sciences, 112(8), 2389-2394.

PP. 1-2

Purpose: To study the positivity of human language

Material: 24 corpora of 10 languages, including Chinese (simplified), Korean and Arabic

Measure: a word’s importance is measured by its frequency

Procedure:

1. For each language, obtain the most frequently used words (around 10K)
2. Invite (and pay) native speakers to rate “how they felt in response to” (p. 2390) each word on a 9-point Likert scale, with 1 representing the most negative or saddest, 5 neutral, and 9 the most positive or happiest. Each word receives 50 ratings, so there are 5 million human assessments in total.

Results:

1. The result can be found in Fig. 1.

## 2020-10-29 #

Finished Kramer et al. (2014)

You can see my summary of this paper in HTML or PDF

Implication-2. Seeing fewer friends' positive posts led people to produce fewer positive words in their own posts, rather than the opposite.

### Drawbacks #

- The effect size is quite small.
### Thought #

As the Editorial Expression of Concern and Correction said, it is “a matter of concern” that what we see on social media is to such a large extent manipulated by tech giants. As the study found, the content we see has an effect on our well-being. Even if it didn’t, users should be able to know what they are going through, rather than becoming subjects in an experiment they are unaware of.

## 2020-10-28 [Completed on 2020-10-29] #

Continue with Kramer et al. (2014)

### Measurements & Measures #

- To test the hypotheses, how are negativity and positivity measured: the percentage of the words produced by a person that were either positive or negative.
- A check before running the experiment: the four groups did not differ in emotional expression in the week prior to the experiment.
- Why use a weighted linear regression: It was described in the Study Design that the chance of a post being omitted is not fixed. However, an effect was found that when people see fewer posts (i.e., more omission), they in turn post fewer words. Therefore, we need to account for this effect by assigning weights to people. Specifically, people having more omission were given a higher weight in the regression. See details on p. 8789.

### Results #

Both H1 and H2 were supported. As can be seen in the figure, when negativity is reduced, people generate more positive words and fewer negative words, compared to the control group. The opposite pattern occurs when positivity is reduced. It shows that emotions expressed by our friends through online social networks influence our own mood.

Some implications:

1. Direct interactions were not necessary for emotional contagion.

## 2020-10-27 #

Continue with Kramer et al. (2014)

### Study design #

- Why are two (separate) control conditions needed? Because the percentage (46.8%) of posts containing at least one positive word is much larger than that (22.4%) of posts containing at least one negative word.
Suppose that for a person, 10% of his positive News Feed is omitted, and there is only one control group. What should be the corresponding percentage of a person’s random News Feed being omitted in this control group? I don’t know. Why? For example, suppose there are three people in experiment A (the positivity reduction group), and their content reduction rates are 12%, 13%, and 14% respectively. Can we then assume that the content reduction rates in the control group should be 12% times 46.8%, 13% times 46.8%, and 14% times 46.8%? No. Why? Because there is also experiment B, whose content reduction rate might be different from that of experiment A. Therefore, each experiment needs a separate control condition.

### Hypotheses #

- H1: If emotions are contagious via pure exposure to verbal expressions, then compared to their control groups, Group A will be less positive (reflected by posting less positive content than before) and Group B will be less negative (reflected by posting less negative content than before).
- H2: “Opposite emotion should be inversely affected” (p. 8789): Group A should express increased negativity, and Group B should express increased positivity.

### Thoughts #

- It’s interesting that in people’s own status updates during the experimental period, only 3.6% were positive and 1.6% negative. However, for posts in people’s News Feed, 46.8% were positive and 22.4% were negative. Why was it that News Feed posts were so much more emotional than people’s own status updates? Is it because Facebook’s algorithms like to show more emotional content to its users? I guess so.

## 2020-10-26 #

Continue with Kramer et al. (2014)

### Why is this study needed? #

Correlational studies cannot answer this question since they cannot support causality. Controlled experiments can support causality, but they have these problems:

1. Exposure is not equal to interaction.
In a controlled experiment, mood change might come from interacting with a happy/sad person, rather than simply being exposed to that person’s mood;
2. Nonverbal cues are unavoidable in a controlled experiment, thus making it impossible for us to disentangle the effect of verbal cues.

Therefore, this study makes unique contributions to answering this question.

### Study design #

- Two parallel experiments: In experiment A, people see less positive emotional content whereas in experiment B, people see less negative emotional content. Both had a control condition, in which posts had an equal chance (see below) of being omitted, randomly (i.e., without considering their emotional valence).
- How much less? Good question! According to the authors, “each emotional post had between a 10% and 90% chance (based on their User ID) of being omitted from their News Feed …”
- Well, how do you categorize a post as positive or negative? Awesome question. If a post contains at least one positive word as defined by LIWC2007, then it is a positive post. The same goes for negative posts.

## 2020-10-25 #

Kramer, A. D., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788-8790.

### Key question #

The key question this paper was trying to answer: does exposure to mood expressed in the News Feed on FB change the content people post (that reflects their mood changes)? Or in the authors' own words, “whether exposure to verbal affective expressions leads to similar verbal expressions, a form of emotional contagion.”

## 2020-10-24 #

Continue reading Lawrence (2007).
### What problems can these measures cause: #

- Authors might 1) complicate their methods section so that it’s difficult for reviewers to fault it; 2) hide results that do not fit with their arguments; 3) split the findings into multiple papers even if one paper is enough to cover all the results; 4) compress the results to meet the requirements of top journals like Nature or Science; 5) hype their work.
- I would add one: p-hacking.
- The point that authors might complicate their methods section resonates with me strongly. After reading papers each day for over 6 weeks, I feel that the methods sections of some papers are so dense and complicated that, if I were the reviewer, I wouldn’t have the time and energy to decode them! This, I think, is really a problem. As I have mentioned multiple times, I feel the ideal studies are those with simple methodology and yet impactful results. A perfect example is Professor Duncan Watts and Steven Strogatz’s masterpiece, Collective dynamics of ‘small-world’ networks.
- Students have fewer opportunities to learn and fail. Since publication is so important, group leaders may end up writing students' work.
- I don’t think this is true in social sciences.
- Scientists spend a large portion of their time networking, which might bring them more co-authors, and leave a positive impression on journal editors.

## 2020-10-23 #

Lawrence, P. A. (2007).

### How science and scientists are assessed today: #

- Impact factors: Journals are evaluated based on their impact factors. Schools, departments and scientists “are assessed according to the impact factors of the journals they published in” (p. R583).
- Number of citations: Scientists are evaluated according to the number of citations their publications receive.

### Why these measures are flawed: #

- Impact factors (IFs): IFs reflect how many times, on average, each paper in a given journal gets cited in the two years following its publication.
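
For my own reference, the two-year impact factor of a journal for year $y$ is usually written roughly as below. This is the common definition as I understand it, not a formula quoted from Lawrence’s article.

$$\mathrm{IF}_{y} = \frac{\text{citations received in year } y \text{ by items the journal published in years } y-1 \text{ and } y-2}{\text{number of citable items the journal published in years } y-1 \text{ and } y-2}$$

Either way, it is a ratio computed for the whole journal, not for any single paper.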
There are two problems with this measurement: 1) IF is about the journal, not about your paper. Even if your paper is flawed, or even wrong, it’s still something you can boast about if it gets published in a top journal; 2) Important findings may receive very few citations within two years of their publication.
- Number of citations: 1) People may cite papers simply because of convenience or visibility, not because of the significance of the studies. Many people don’t even need to read the papers they cite. 2) Because citations are so important these days, there might be unethical behavior involved. For instance, people may gatecrash an author list by providing a reagent or data without actually participating in the study, or simply through power or authority.

### What problems can these measures cause: #

- Paper chase: Scientists spend so much time on writing and reviewing for, and submitting to, top journals that they don’t have much time left for solving scientific problems;
- Scientists will dodge uncharted areas and unpopular topics which are too risky.

## 2020-10-22 #

1. Finished Guo et al. (2014)

I admire this piece of research very much. Again, it’s the ideal kind I am striving for: simple, straightforward, easy to understand, and yet impactful.

1. Lawrence, P. A. (2007). The mismeasurement of science. Current Biology, 17(15), R583-R585.

pp. 1-2.

It is not so funny that, in the real world of science, dodgy evaluation criteria such as impact factors and citations are dominating minds, distorting behaviour and determining careers.

Citations are determined more by visibility and convenience than by the content or quality of the work.

## 2020-10-21 #

Guo, P. J., Kim, J., & Rubin, R. (2014, March). How video production affects student engagement: An empirical study of MOOC videos. In Proceedings of the first ACM conference on Learning@ scale conference (pp. 41-50).

pp. 1-7.

### If I am asked to design a course for Coursera, I’d better: #

1.
Segment videos into short chunks (< 6 minutes);
2. Have my head recorded. Presentations should be inserted at opportune times or simply be presented with a picture-in-picture view;
3. Film in an informal setting where I can make eye contact with the potential audience, just like in an office hour talk;
4. If I don’t want my head to be filmed, I’d better use Khan-style tutorials rather than slides;
5. Plan my lessons “specifically for an online video format” (p. 10) [Edited on 2020-10-22].

## 2020-10-20 #

Börner et al. (2018). Skill discrepancies between research, education, and jobs reveal the critical need to supply soft skills for the data economy. Proceedings of the National Academy of Sciences, 115(50), 12630-12637.

Main takeaway: Soft skills are in high demand by the industry.

My issue: I like data viz. However, I feel the visualizations in this paper are a little bit too much.

## 2020-10-19 #

Fei-Fei, L., & Perona, P. (2005, June). A bayesian hierarchical model for learning natural scene categories. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (Vol. 2, pp. 524-531). IEEE.

It’s all Greek to me.

## 2020-10-18 #

Larivière, Ni, Gingras, Cronin, & Sugimoto. (2013). Bibliometrics: Global gender disparities in science. Nature News, 504(7479), 211.

Barriers to women in science remain widespread worldwide.

Main takeaways:

- In the most productive countries, papers with women in dominant author positions, i.e., sole author, first author, and last author, are cited less than those with men in the same positions;
- South America and Eastern Europe had greater gender parity in terms of proportion of authorships.
- Disciplines dominated by women all have to do with “care”, for example, nursing; speech, language, and hearing; education.
- Natural sciences and humanities are dominated by men. Social sciences had a higher proportion of female authors.
- “Female collaborations are more domestically oriented than are the collaborations of males from the same country” (p. 213)

My issue: How did the authors assign gender to each author? It seems to me a very daunting task, especially when the names are of non-Western origin.

## 2020-10-17 #

Geman, D., & Geman, S. (2016). Opinion: Science in the age of selfies. Proceedings of the National Academy of Sciences, 113(34), 9384-9387.

My thoughts are here.

## 2020-10-16 #

Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: traps in big data analysis. Science, 343(6176), 1203-1205.

Major takeaway: Big data research can learn from, and collaborate with, small data research, which offers data that is not contained in big data.

I started to think about my selfie studies. Specifically, I looked at 1) whether there are cultural differences between Chinese women’s selfies on China’s Weibo and White women’s selfies on Twitter. For example, is it true that Chinese women focus on their face whereas White women focus on their body in their selfies? Do Chinese women’s selfies show more cuteness? I also looked at 2) whether there are gender differences between men’s selfies and women’s selfies. For example, do women show more self-touching in selfies?

I used a small-data approach. Although I downloaded over 30,000 images from Twitter and 8,000 images from Weibo, I only selected 200 from each platform for analysis, simply because I didn’t have the manpower to analyze them all.

Talking about big data and small data research, I think I can combine the two here. Human content analysis can offer some insights and then directions for big data research. After all, there are so many things to detect in a selfie: the gender of the person, his or her mood, surroundings, posture, facial expressions, etc. Deep learning algorithms need some direction so that they can give us the analysis we need.

## 2020-10-15 #

Finished Bollen, Mao, & Zeng.
(2011)

## 2020-10-14 #

1. Finished Giles (2012).

Thoughts: Professor Granovetter is right in pointing out that data itself might not help us have a deeper understanding of our society. After all, his seminal paper on “weak ties” is based on theoretical thinking rather than data. What are most of the data in research papers used for? To test theories. But theories arise from thinking, not data.

Data is limited. It’s extremely difficult for most scholars to get high-quality large-scale data. That shouldn’t become a barrier to theoretical advances. Scholars who cannot get access to quality data can focus on theoretical thinking.

1. Bollen, J., Mao, H., & Zeng, X. (2011). Twitter mood predicts the stock market. Journal of computational science, 2(1), 1-8.

## 2020-10-13 #

1. Finished Sarma & Kay (2020).

Major take-aways:

- “Weakly informed priors” are popular among scholars practicing Bayesian inference. However, scholars might have different interpretations of this concept and different strategies to implement it.
- Innovative prior elicitation interfaces can assist novice Bayesian practitioners in setting priors.

1. Giles, J. (2012). Making the links. Nature, 488(7412), 448-450.

pp. 448-449.

## 2020-10-12 [Completed on 2020-10-13] #

Sarma & Kay. (2020, April). Prior Setting In Practice: Strategies and rationales used in choosing prior distributions for Bayesian analysis. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-12).

pp. 1-8

## 2020-10-11 #

Shen & Williams (2011). Unpacking time online: Connecting internet and massively multiplayer online game use with psychosocial well-being. Communication Research, 38(1), 123-149.

Main Takeaway: The psychological impacts of Internet activities are nuanced.

## 2020-10-10 #

Finished Centola & Macy. (2007).

Main Takeaway: The strength of weak ties should not be simply generalized to complex contagions, which require affirmation from multiple sources.
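
A small sketch helps me see why. The code below is my own toy threshold model, not the simulation from Centola & Macy: the two three-person clusters, the node names, and the `spread` helper are all hypothetical. A node adopts once at least `threshold` of its neighbors have adopted, so `threshold=1` stands in for a simple contagion and `threshold=2` for a complex one.

```python
def build_graph(edges):
    """Turn a list of undirected edges into an adjacency dictionary."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def spread(adjacency, seeds, threshold):
    """Return the set of adopters once no further node can be activated.

    A node adopts when at least `threshold` of its neighbors have adopted.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbors in adjacency.items():
            if node in adopted:
                continue
            if sum(neighbor in adopted for neighbor in neighbors) >= threshold:
                adopted.add(node)
                changed = True
    return adopted


# Two tightly knit clusters (hypothetical): a1-a2-a3 and b1-b2-b3.
clusters = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
            ("b1", "b2"), ("b1", "b3"), ("b2", "b3")]

# One long, narrow tie between the clusters versus a wide bridge of three ties.
narrow = build_graph(clusters + [("a3", "b1")])
wide = build_graph(clusters + [("a2", "b1"), ("a3", "b1"), ("a3", "b2")])

seeds = {"a1", "a2"}
simple_over_narrow = spread(narrow, seeds, threshold=1)
complex_over_narrow = spread(narrow, seeds, threshold=2)
complex_over_wide = spread(wide, seeds, threshold=2)

print(sorted(simple_over_narrow))
print(sorted(complex_over_narrow))
print(sorted(complex_over_wide))
```

With a threshold of 1, the single tie is enough to reach everyone. With a threshold of 2, the single tie leaves the second cluster untouched, while the three-tie bridge carries the behavior across.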
Therefore, not only the length, but also, and maybe more importantly, the width of the ties influences complex contagions.

## 2020-10-09 #

Centola, D., & Macy, M. (2007). Complex contagions and the weakness of long ties. American journal of Sociology, 113(3), 702-734.

p. 702 - p. 711

## 2020-10-08 #

### Centola, D. (2011). #

1. Within the unstructured condition, there were more non-obese adopters than obese adopters, both in terms of number and percentage;
2. Across conditions: homophily boosted adoption among both the obese (P < 0.01) and the non-obese people (P < 0.05), using the Mann-Whitney U test.

We can see that homophily had a significant effect on the adoption of healthy behaviors. However, is it because obese people are more likely to be exposed to the behavior, or because those who are exposed are more likely to adopt these behaviors in a homophilous group?

1. It turns out that within both conditions, the percentages of obese and non-obese people who were exposed did not differ significantly.
2. Across conditions: homophily boosted both the number and the fraction of the obese who were exposed to the behavior (P < 0.05), using the Mann-Whitney U test. This happened even though obese people initially had greater exposure in the unstructured networks.
3. Did homophily affect the adoption rate among those exposed? The effect was significant among the exposed obese people (P < 0.01), using the Mann-Whitney U test, but not among the exposed non-obese individuals.

I like this study: simple, and impactful.

Re-reading on 2020-11-22

### Literature #

Homophily is defined as “the tendency of social contacts to be similar to one another”. Although research on diffusion and research on social influence differ over the effects of homophily on behavior spreading at the dyadic level, both agree that homophily decreases adoption at the network level. This is because, obviously, the more homophilous one’s network is, the less likely one is to be exposed to individuals with different characteristics.
If you are a less healthy person, and you find yourself in a homophilous network, it’s less likely for you to be aware of what those healthy guys are doing.

### Purpose #

To study the effect of homophily on the adoption of healthy behavior

### Design #

710 participants were randomly put into two conditions: a homophilous population condition and an unstructured population condition. The homophilous population condition consists of people having similar individual characteristics (gender, age, and BMI) whereas people in the other condition are random and mixed.

All networks in the study have the same size (= 72), clustering coefficient (= 0.4), and degree distribution (= 6). The only difference is the level of homophily.

The study consisted of five trials, each having two social networks. All these trials ran at the same time, for seven weeks.

The healthy behavior to be adopted is to write a diet diary online. The seed of the behavior in all networks is a “healthy” individual. At the start of Week 1, the author activated the seed nodes simultaneously. Once an individual signs up, their neighbors will be notified via email.

### Results #

Across all five trials, people in the homophilous condition had a higher adoption rate than those in the unstructured condition.

Comparing adoption among obese and nonobese individuals: Within the homophilous condition, a greater percentage of obese individuals than nonobese individuals adopted the behavior. In the unstructured condition, both the number and the percentage of nonobese adopters were larger than those of obese adopters. In fact, there were no obese adopters in the unstructured condition at all. Comparing the two conditions: homophily increased adoption among both the obese and nonobese people.

However, from these results, we cannot say for sure that homophily is the reason. For example, it may be that in the homophilous condition, obese people have more neighbors who sign up and thus have more exposure.
It may also be that homophily increases the likelihood that people sign up once they are exposed. Therefore, we need to compare 1.) exposure, and 2.) the likelihood to adopt once exposed in the two conditions.

It showed that, within each of the two conditions, the percentage of exposed obese people (num. of exposed obese / total num. of obese ppl) and that of exposed nonobese people (num. of exposed nonobese / total num. of nonobese ppl) did not differ significantly. Across conditions, homophily significantly increased the percentage of exposed obese people. See Fig. 2D.

How about the likelihood? Within conditions, the likelihood to adopt after exposure was much higher for obese people than for nonobese people. Across conditions, homophily significantly increased obese people’s likelihood to adopt after exposure.

Therefore, homophily increased both obese people’s access to healthy behavior and their likelihood of adopting it.

… low adoption levels of health innovations among less healthy individuals may be a function of social environment rather than a baseline reluctance for adoption.

## 2020-10-07 #

1. Finished Eubank et al. (2004)

Time of withdrawal to the home is by far the most important factor (in a disease outbreak in cities), followed by delay in response. This indicates that targeted vaccination is feasible when combined with fast detection. Ironically, the actual strategy used is much less important than either of these factors. – Eubank et al. (2004)

1. Centola, D. (2011). An experimental study of homophily in the adoption of health behavior. Science, 334(6060), 1269-1272.

- Within the homophilous condition, a higher percentage of obese people than non-obese people adopted the behavior (P < 0.05).

P. 1270

## 2020-10-06 #

1. Finished Schmälzle et al. (2017).
Main findings:

• Social exclusion correlates with increased connectivity in the brain’s mentalizing system;

• When excluded, people whose friends are sparsely connected with each other showed increased connectivity within key brain systems.

Overall, social exclusion / inclusion is related to connectivity within one’s brain networks. Also, the density of one’s friendship network has an effect on the connectivity change.

1. Eubank et al. (2004). Modelling disease outbreaks in realistic urban social networks. Nature, 429(6988), 180-184.

## 2020-10-05 #

Schmälzle et al. (2017). Brain connectivity dynamics during social interaction reflect social network structure. Proceedings of the National Academy of Sciences, 114(20), 5153-5158.

p. 5153 - p. 5156

## 2020-10-04 #

Finished Chambliss. (1989).

## 2020-10-03 #

Superlative performance is really a confluence of dozens of small skills or activities, each one learned or stumbled upon, which have been carefully drilled into habit and then are fitted together in a synthesized whole.

– Chambliss, D. F. (p. 81)

### Excellence requires qualitative differentiation. #

Those who are more successful are doing different things, rather than more of the same things. Quantitative changes do bring success, but only within the world you are currently in. You cannot go to another world by doing more of what you have been doing. Top performers are better seen as different rather than as better.

### Talent is not the reason for excellence. #

1. First of all, factors other than talent predict success more precisely.
2. Second, you cannot distinguish talent from its effects, i.e., you cannot realize there is talent until someone succeeds.
3. Third, the amount of talent needed for excellence is surprisingly small.

### Excellence is mundane. #

1. Success is ordinary. Success is simply doing small tasks consistently and correctly.

Note: Below are the notes on 2020-10-04

1. Motivation is also ordinary.
Gold medalists did not think too far ahead. Instead, they focused on the most immediate goals, the so-called “small wins”. For example, Steve Lundquist, who won two gold medals in swimming in the Los Angeles Olympics, set a goal that he would win every single swim in every single practice. Small wins added up to excellence and success.
2. Don’t take what you do as too important. You should maintain mundanity. If you are going to deliver a commencement speech in front of an audience of thousands, you should know that almost nobody cares about or remembers what you have to say. When you are writing your doctoral thesis, you should also be aware that few people will read what you write.

## 2020-10-02 #

Chambliss. (1989). p7-p12.

## 2020-10-01 #

1. Finished Bullmore & Sporns. (2009)
2. Chambliss, D. F. (1989). The mundanity of excellence: An ethnographic report on stratification and Olympic swimmers. Sociological theory, 7(1), 70-86. p2-p7.

# 2020-09 #

## 2020-09-30 #

Bullmore & Sporns. (2009). p6-p9.

## 2020-09-29 #

Bullmore, E., & Sporns, O. (2009). Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews neuroscience, 10(3), 186-198. p1-p6.

## 2020-09-28 #

Stivers et al. (2009)

There is a universal pattern for turn-taking. People aim to minimize gaps and overlap in conversations.

• Slower:

1. Nonanswer responses
2. Disconfirmation responses
3. Responses without a visible component (e.g., head nods, shrugs, head shakes, blinks, or eyebrow flashes)

• Faster: Questions with gaze from the questioner

## 2020-09-27 #

1. Stivers et al. (2009). Universals and cultural variation in turn-taking in conversation. Proceedings of the National Academy of Sciences, 106(26), 10587-10592.

2. Liljeros et al.
(2001)

• For both males and females, the cumulative distribution of the number of partners in the previous 12 months almost perfectly followed a straight line, indicating scale-free power-law characteristics;

• For both genders, the cumulative distribution of the total number of sexual partners in the entire lifetime followed a straight line only when $k > 20$.

• The network of sexual partners is a scale-free one, meaning that you cannot assume, for example, 90% of the individuals have 3 - 10 partners. This is simply because there is no inherent scale. It’s a crazy world, literally. I cannot believe that there are people who have over 100, even 1000 partners in their lifetime. Isn’t this a crazy world?

Other notes:

• Thanks to this paper, I now know that for a power-law distribution to show a straight line, I need to plot the cumulative distribution function (CDF) on log-log axes.

• One thing I didn’t understand is how the authors could conclude that “the rich get richer” by simply looking at Figure 2a. I don’t think it is a rigorous remark.
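To make the CDF note above concrete for myself, here is a minimal sketch on synthetic data (not the survey data from the paper). It assumes a pure power law with density exponent `alpha = 2.5`, a value picked only for illustration. For such a distribution the complementary cumulative distribution P(X ≥ x) falls off as x^-(alpha - 1), so on log-log axes it should be a straight line with slope close to -(alpha - 1). The helper names `empirical_ccdf` and `loglog_slope` are my own.

```python
import math
import random


def empirical_ccdf(values):
    """Return (x, P(X >= x)) pairs for the sorted sample."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (n - i) / n) for i, x in enumerate(xs)]


def loglog_slope(points):
    """Least-squares slope of log(p) against log(x)."""
    lx = [math.log(x) for x, _ in points]
    ly = [math.log(p) for _, p in points]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


random.seed(0)
alpha = 2.5  # assumed density exponent, for illustration only

# Inverse-transform sampling from a power law with minimum value 1.
sample = [(1.0 - random.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(5000)]

slope = loglog_slope(empirical_ccdf(sample))
print(f"fitted log-log slope: {slope:.2f}, expected about {-(alpha - 1):.2f}")
```

A least-squares fit on log-log axes is only a quick visual check, not a rigorous way to estimate a power-law exponent.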

## 2020-09-26 #

1. Del Vicario et al. (2016).

This paper is a little bit too technical for me, especially the second part that involves modeling. Also, I had difficulty understanding the conceptualization of “homogeneity” and “polarization”.

Major takeaways from this paper:

• Information on social media spreads quickly: within 2 hours it reaches around 20% of the people it will eventually reach, and within 5 hours around 40%. This is true for both science and rumors.

• Science news is usually quickly diffused. However, how long the interest lasts does not correspond to how large it is. This means that even though people keep sharing it, not a lot of people will be interested in it.

• Conspiracy rumors diffuse slowly, and their cascade size is positively correlated with their lifetime, meaning that the longer a rumor lasts, the more people become interested in it.

1. Liljeros, F., Edling, C. R., Amaral, L. A. N., Stanley, H. E., & Åberg, Y. (2001). The web of human sexual contacts. Nature, 411(6840), 907-908.

This is the kind of study I admire: short, interesting, and impactful.

## 2020-09-25 #

Del Vicario et al. (2016).

## 2020-09-24 #

1. Bakshy, E., Messing, S., & Adamic, L. A. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130-1132.
• Among 7 million distinct URLs shared by 10 million Facebook users in the US, 13% were hard news;

• Around 20% of a person’s friends had the opposite political affiliation;

• Liberals had fewer friends who shared news from the other side;

• Controlling for the position in the news feed, it seemed conservatives were more likely to click on cross-cutting content, i.e., news that came from the other side. This result surprised me.

1. Del Vicario et al. (2016). The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3), 554-559.

## 2020-09-23 #

Finished Kay et al. (2016).

Helping researchers in different fields set priors might be something worth doing in the future.

## 2020-09-22 #

1. Hullman et al. (2017)

2. Kay, M., Nelson, G. L., & Hekler, E. B. (2016, May). Researcher-centered design of statistics: Why Bayesian statistics better fit the culture and incentives of HCI. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 4521-4532).

• Bayesian approaches make knowledge accrual possible without meta-analysis approaches

• Even though scholars use effect sizes and confidence intervals, the ultimate goal of looking for small p-values will ruin everything.

(p.4)
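To make the knowledge-accrual point concrete for myself, here is a minimal sketch with a conjugate Beta-Binomial model. The two "studies" and their counts are made up, and this is not an example from Kay et al. The posterior from study 1 simply becomes the prior for study 2, and the result matches what you would get by pooling both data sets under the original prior. The helper names `update_beta` and `beta_mean` are my own.

```python
from fractions import Fraction


def update_beta(prior, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial data gives a Beta posterior."""
    a, b = prior
    return (a + successes, b + failures)


def beta_mean(params):
    """Mean of a Beta(a, b) distribution, kept as an exact fraction."""
    a, b = params
    return Fraction(a, a + b)


prior = (1, 1)  # flat prior, assumed for illustration

# Hypothetical study 1: 7 successes, 3 failures.
after_study_1 = update_beta(prior, 7, 3)

# Hypothetical study 2 uses the posterior of study 1 as its prior.
after_study_2 = update_beta(after_study_1, 12, 8)

# Pooling all data under the original prior gives the same posterior.
pooled = update_beta(prior, 7 + 12, 3 + 8)

print(after_study_1, after_study_2, pooled, beta_mean(after_study_2))
```

The sequential and pooled posteriors agree here only because the two studies are assumed to measure the same quantity with the same likelihood.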

## 2020-09-21 #

Hullman, J., Kay, M., Kim, Y. S., & Shrestha, S. (2017). Imagining replications: Graphical prediction & discrete visualizations improve recall & estimation of effect uncertainty. IEEE transactions on visualization and computer graphics, 24(1), 446-456.

Continue from 2nd para. of 3.2 (Evaluations with Users) tomorrow.

## 2020-09-20 #

Vosoughi et al. (2018)

The work is indeed significant. It compared the spread of true and false news on Twitter and concluded that false news spread faster, deeper, and farther than the truth. False political news, in particular, diffused especially broadly and deeply.

• Was it because those who spread the false were more influential or active?

Not really. Those who spread false news had fewer followers, followed fewer people on Twitter, were less likely to be verified, and had been on Twitter for less time.

• Was it because false news was more novel and users were more likely to retweet novel information?

• False rumors were indeed more novel than the truth;
• False news was objectively more novel, but did users get it?
• Yes, replies to false news showed greater surprise and disgust, whereas the truth inspired more sadness and joy.
• Was it because of selection bias? I mean, the tweets from the six organizations might not be representative of all tweets.

• The authors verified a second sample of tweets, which were labeled by three undergraduate students as true, false, or mixed. Again, the results were the same.
• Did false news spread faster, deeper, farther, and more broadly because of bot activities? I mean, was it because of bots that crazily retweeted and replied to false news?

• Two bot-detection algorithms were applied independently to detect and remove bots before data analysis. Results were the same. This has significant implications: that false news traveled faster and farther not because of bots, but because of humans.

### I had several issues: #

At first glance, the data visualization in this article is good. However, most of the figures used only red and green and are therefore not friendly to color-blind people.

1. Content analysis

They should report Krippendorff’s alpha rather than an agreement of 90%, I believe.

1. No hypotheses beforehand
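On the content-analysis point above, here is a minimal sketch of why I would rather see Krippendorff’s alpha than raw agreement. The ratings are made up (20 items, two coders, heavily skewed toward one label) and have nothing to do with the paper’s actual data. The function implements the standard coincidence-matrix formula for nominal data with no missing values; the names `percent_agreement` and `krippendorff_alpha_nominal` are my own.

```python
from collections import Counter
from itertools import permutations


def percent_agreement(units):
    """Share of units on which all coders gave the same label."""
    return sum(1 for labels in units if len(set(labels)) == 1) / len(units)


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels, no missing values.

    Undefined (division by zero) if every rating is the same label.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a unit with one rating carries no agreement information
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    return 1 - observed / expected


# Hypothetical ratings: 18 items both coders call "false", 2 items they split on.
ratings = [["false", "false"]] * 18 + [["true", "false"], ["false", "true"]]

agreement = percent_agreement(ratings)
alpha_value = krippendorff_alpha_nominal(ratings)
print(f"agreement = {agreement:.2f}, alpha = {alpha_value:.3f}")
```

With these made-up ratings the coders agree on 90% of the items, yet alpha comes out slightly below zero, because almost all of that agreement is what chance alone would produce when one label dominates.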

## 2020-09-19 #

Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146-1151.

## 2020-09-18 #

Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Communications of the ACM, 59(7), 96-104.

## 2020-09-17 #

Hilbert, M., & López, P. (2011). The world’s technological capacity to store, communicate, and compute information. Science, 332(6025), 60-65.

## 2020-09-16 #

González-Bailón, S., Borge-Holthoefer, J., Rivero, A., & Moreno, Y. (2011). The dynamics of protest recruitment through an online network. Scientific reports, 1, 197.

• Study goal: Study whether and how social network sites encourage recruitment in social movements.

• Why wasn’t it published in Nature or Science: A first look at this paper made me feel that it should have been published in Nature or Science. I believe the authors must have tried. After reading the whole paper, I concluded that a lack of sufficient evidence might have been the reason why it didn’t make it. As the authors mentioned in their limitations section, there were so many factors other than Twitter that influenced the movement in question, and it was impossible to single them out.

## 2020-09-15 #

Lazer et al. (2009). Life in the network: the coming age of computational social science. Science, 323(5915), 721.

The potential of computational social science and how to make preparations for its future.

## 2020-09-14 #

p 1-3. Lazer et al. (2018). The science of fake news. Science, 359(6380), 1094-1096.

• Increasing partisan preferences in the US created a context for fake news to attract huge audiences;

• We don’t know the exact ratio of fake news against real news, and we don’t know the medium-to-long-run effect of exposure to fake news on people’s attitudes.

• Bots on social media are hard to detect. Once a detection technique is developed, bots will upgrade themselves.

• Possible interventions:

1. Encouraging people to use fact checking. However, we are not sure whether this is useful or not, partly due to people’s confirmation bias and desirability bias.
2. Internet oligopolies should collaborate with academia to understand how pervasive fake news is. Also, these oligopolies' power should be contained by, for example, legal systems.

## 2020-09-13 #

Lazer et al. (2020)

• Definition:

• Computational social science encompasses language, location, movement, networks, images, and video, analyzed with statistical models that capture multifarious dependencies.
• Problems

1. Interdisciplinary research is not encouraged enough, especially research that involves cooperation between social and computer scientists, due to unfavorable policies at universities;
2. Proprietary data unavailable to researchers.
3. Available data is not intended for research and won’t be shared with other researchers, which impedes reproducibility.
4. Lack of regulatory guidance from university IRBs about collecting and analyzing sensitive data.
• Recommendations

1. Collaborate and negotiate with private companies for data;
2. Build infrastructures that provide data as well as preserve participants' privacy;
3. Develop new ethical guidelines;
4. Reorganize universities so that 1) multi-disciplinary collaboration is professionally or financially rewarded, and 2) ethical research is enforced;
5. Researchers should make sure that their work serves the public good.

## 2020-09-12 #

1. p1. Lazer et al. (2020). Computational social science: Obstacles and opportunities. Science, 369(6507), 1060-1062.

2. Recapping Centola (2010):

• Contribution: An experiment whose results ran contrary to previous findings regarding the strength of weak ties.
• Conclusion: Networks with local clustering are conducive to behavioral diffusion.
• Method: An experiment with two groups. One group found themselves in a random network, and the other group in a clustered-lattice network. The degree distribution of the two networks is identical.
• Why could it be published in Science: Maybe the first empirical test of two competing hypotheses regarding the effect of network topology on behavior spreading.
• My question: I didn’t see many long ties in the “small-world network” in Figure 1.
• Improvements: I didn’t know all of the statistical tests used in this paper. I know the Mann-Whitney U test but I don’t know Kolmogorov-Smirnov. I am wondering whether the study could be conducted using Bayesian statistics.
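As a note to self on the two tests mentioned above: the Mann-Whitney U statistic counts how often a value from one group exceeds a value from the other (a rank comparison), while the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the two empirical CDFs (a whole-distribution comparison). The sketch below computes only the two statistics, not p-values, on made-up numbers that are not from Centola (2010). The helper names are my own.

```python
from bisect import bisect_right


def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y, ties count as half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in xs + ys:
        cdf_x = bisect_right(xs, v) / len(xs)
        cdf_y = bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(cdf_x - cdf_y))
    return gap


# Hypothetical samples: same center, very different spread.
group_a = [4, 5, 6, 7]
group_b = [1, 2, 9, 10]

u = mann_whitney_u(group_a, group_b)
d = ks_statistic(group_a, group_b)
print(f"U = {u} out of {len(group_a) * len(group_b)} pairs, KS D = {d}")
```

Here U is exactly half of the 16 possible pairs, so the rank comparison sees no shift between the groups, while the KS statistic of 0.5 picks up the difference in spread.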

## 2020-09-11 #

1. p5-p12. Cha et al. (2007, October).
2. p1-p4. Centola, D. (2010). The spread of behavior in an online social network experiment. Science, 329(5996), 1194-1197.

## 2020-09-10 #

• p1-p4. Cha et al. (2007, October). I tube, you tube, everybody tubes: analyzing the world’s largest user generated content video system. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement (pp. 1-14).