Integrity in Social Simulation: Referencing Data Sources, Codebases, and Human Behavior
In the expanding world of computational social science, social simulation has become a vital methodology for studying complex social phenomena. Whether through agent-based modeling, microsimulation, or hybrid frameworks, simulation-based research enables scholars to test theories, replicate real-world dynamics, and explore scenarios that would be difficult or impossible to study empirically. But as this field grows in influence, so does the need to address a foundational concern: how do we ensure academic integrity through proper citation of data sources, simulation code, and modeled behaviors?
Why Integrity Matters in Simulation-Based Research
Academic integrity is more than a formality; it's the foundation of scholarly communication. In traditional research, citation:
- Acknowledges prior work
- Allows replication and verification
- Demonstrates methodological rigor
In simulation-based research, this extends further. Researchers not only build on conceptual frameworks but also rely heavily on:
- Data sets (demographic, economic, behavioral)
- Open-source or custom codebases
- Empirical studies on human behavior and decision-making
Failure to reference these components accurately can undermine the reproducibility, trustworthiness, and credibility of simulation findings.
What Needs to Be Cited in Social Simulation?
- Empirical Data: any dataset used to calibrate or validate models
- Simulation Code: original or borrowed code and software tools
- Theoretical Models: schemas or assumptions drawn from existing research
- Human Behavior Frameworks: models based on psychology, sociology, or economics
- Visualization and Analysis Tools: libraries or scripts used for post-processing
Citing Code and Simulation Software
Referencing code in social simulation requires a blend of software citation and academic norms. When reusing or modifying code:
- Include the repository URL (for example, on GitHub), the version number or commit hash, and the license
- Mention the original author(s)
- Provide a DOI if available (e.g., via Zenodo)
Example (APA Style):
Smith, A. (2024). NetSim Framework (Version 2.0) [Computer software]. GitHub. https://github.com/alicesmith/netsim
For in-house or unpublished simulation frameworks, offer a detailed appendix or supplementary material that documents the code architecture and assumptions.
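As a minimal sketch of how the APA entry above could be generated from stored metadata rather than typed by hand, the Python below formats a software reference. The `SoftwareRecord` class and `apa_software_citation` function are hypothetical names introduced for illustration, not part of any citation library, and the field layout is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SoftwareRecord:
    """Metadata needed to cite a piece of software (hypothetical structure)."""
    author: str                     # already formatted, e.g. "Smith, A."
    year: int
    title: str
    version: str
    publisher: str                  # hosting platform or archive, e.g. "GitHub"
    url: str
    license: Optional[str] = None   # kept for reuse checks; not printed in the reference
    doi: Optional[str] = None       # e.g. a DOI minted by an archive such as Zenodo


def apa_software_citation(rec: SoftwareRecord) -> str:
    """Format an APA-style software reference, preferring a DOI link when one exists."""
    locator = f"https://doi.org/{rec.doi}" if rec.doi else rec.url
    return (
        f"{rec.author} ({rec.year}). {rec.title} (Version {rec.version}) "
        f"[Computer software]. {rec.publisher}. {locator}"
    )


# The record mirrors the example reference given in this section.
netsim = SoftwareRecord(
    author="Smith, A.",
    year=2024,
    title="NetSim Framework",
    version="2.0",
    publisher="GitHub",
    url="https://github.com/alicesmith/netsim",
)
print(apa_software_citation(netsim))
```

Keeping the record next to the model code means the version in the reference can be updated in the same commit that upgrades the dependency.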
Referencing Human Behavior in Models
Agent-based and microsimulation models often rely on empirical findings about human decision-making. These may include:
- Heuristics and bounded rationality
- Social norms and peer influence
- Economic preferences or biases
If these are based on specific studies, they must be properly cited. For instance, using Kahneman and Tversky’s Prospect Theory in an agent’s decision function should reference the original source, not just a derivative interpretation.
Example (MLA Style):
Kahneman, Daniel, and Amos Tversky. "Prospect Theory: An Analysis of Decision under Risk." Econometrica, vol. 47, no. 2, 1979, pp. 263–291.
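One way to keep a behavioral assumption traceable is to store its source beside the code that depends on it. The sketch below is an illustration only: `subjective_value` is a hypothetical function, and the power-function form and the default parameter values are placeholder assumptions rather than estimates taken from the cited paper. It reproduces only the qualitative shape described by Prospect Theory, where losses weigh more heavily than equal-sized gains.

```python
# Citation for the behavioral assumption, kept next to the code that relies on it.
PROSPECT_THEORY_SOURCE = (
    'Kahneman, Daniel, and Amos Tversky. "Prospect Theory: An Analysis of '
    'Decision under Risk." Econometrica, vol. 47, no. 2, 1979, pp. 263–291.'
)


def subjective_value(
    outcome: float,
    curvature: float = 0.5,       # placeholder value, NOT an empirical estimate
    loss_aversion: float = 2.0,   # placeholder value, NOT an empirical estimate
) -> float:
    """Subjective value of a gain or loss relative to a reference point of zero.

    Behavioral basis: see PROSPECT_THEORY_SOURCE.
    Modeling assumption: the power-function form is an illustrative choice.
    """
    if outcome >= 0:
        # Gains: diminishing sensitivity as the gain grows.
        return outcome ** curvature
    # Losses: same curvature, scaled up so losses loom larger than gains.
    return -loss_aversion * ((-outcome) ** curvature)


print(subjective_value(100.0), subjective_value(-100.0))
print(PROSPECT_THEORY_SOURCE)
```

Separating the citation (what the literature claims) from the modeling assumption (what the code actually does) makes it clear to reviewers which parts of an agent's decision rule are sourced and which are the modeler's own choices.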
Which Citation Style to Use?
Simulation research is inherently interdisciplinary. This complicates citation, as different disciplines favor different styles:
Discipline / Context | Recommended Style | Key Reason |
---|---|---|
Psychology, Sociology, Education | APA | Emphasizes date of research; clear structure for empirical studies |
Literature, Arts, Language Studies | MLA | Focus on author and page number; ideal for textual analysis |
History, Theology, Philosophy | Chicago (Notes & Bibliography) | Detailed footnotes; rich contextual referencing |
Publishing, Journalism, Mixed Academic Fields | Chicago (Author–Date) | Flexible for multidisciplinary work; similar to APA but more versatile |
Undergraduate Assignments (General) | APA or MLA | Simple formatting; easier for students to apply consistently |
Simulation-Based Research | APA or Chicago | APA for empirical analysis; Chicago for models built on archival or documentary sources |
For an in-depth comparison, consult the Research, Writing & Integrity blog at icai-me.com.
Special Considerations for Simulation-Based Studies
1. Simulation Output as Data
Treat simulation output (e.g., behavioral patterns, emergent dynamics) like generated data. Archive results in a citable repository and reference accordingly.
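A minimal sketch of what "output as data" can look like in practice: each run writes its results together with a small manifest recording the seed, the model version, and a checksum, so the archived files can be verified and cited later. The function names, file names, and manifest fields are assumptions made for this example, and `run_toy_simulation` is a stand-in for a real model.

```python
import hashlib
import json
import random
from datetime import datetime, timezone
from pathlib import Path


def run_toy_simulation(seed: int, steps: int = 5) -> list[float]:
    """Stand-in for a real model: a seeded, reproducible sequence of values."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]


def archive_run(output_dir: Path, seed: int, model_version: str) -> dict:
    """Write results plus a manifest describing how they were produced."""
    output_dir.mkdir(parents=True, exist_ok=True)

    results_path = output_dir / "results.json"
    results_path.write_text(json.dumps(run_toy_simulation(seed)), encoding="utf-8")

    manifest = {
        "model_version": model_version,
        "seed": seed,
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "results_file": results_path.name,
        # The checksum lets readers confirm they hold the exact archived output.
        "sha256": hashlib.sha256(results_path.read_bytes()).hexdigest(),
    }
    (output_dir / "manifest.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )
    return manifest


if __name__ == "__main__":
    print(archive_run(Path("run_001"), seed=42, model_version="0.1.0"))
```

The resulting folder can then be deposited in a repository that issues a persistent identifier, and the identifier cited like any other dataset.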
2. Multi-Author Code Contributions
If your model uses multi-author libraries (e.g., Mesa in Python), cite the software and its contributors.
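Citing a library precisely requires knowing which version was actually installed when the model ran. The sketch below uses Python's standard `importlib.metadata` module to record this. `dependency_versions` is a hypothetical helper, and the package names passed to it are examples only.

```python
from importlib.metadata import PackageNotFoundError, version


def dependency_versions(packages: list[str]) -> dict[str, str]:
    """Map each package name to its installed version, or note that it is missing."""
    found: dict[str, str] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


# Example package names; replace with the libraries your model actually imports.
print(dependency_versions(["mesa", "numpy"]))
```

The recorded versions can go into the methods section or a supplementary file, alongside the formal reference for each library.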
3. Ethically Sensitive Modeling
When simulating sensitive topics such as crime, health, or migration, careful citation matters even more. Credit the sources of data about the communities studied, and state the model's limitations explicitly.
Tools for Better Citation in Simulation
- Zotero and Mendeley: store references for datasets and software
- Zenodo: archive code and get a DOI
- JOSS (Journal of Open Source Software): publish and cite simulation tools
- OSF (Open Science Framework): bundle simulation artifacts and documentation
Building Credibility Through Transparency
In social simulation, the credibility of your findings depends not only on the sophistication of your model but also on your transparency as a researcher. By properly citing datasets, code, behavioral theories, and tools, you uphold the ethical standards of science and empower others to validate, extend, or build upon your work.
Proper citation isn't bureaucratic overhead; it's a pillar of research integrity. Make it visible, make it precise, and make it part of your modeling workflow.