Expanding the Circle: Mapping out the circumplex R package
At SITAR 2018, the circumplex package for the R software environment (R Core Team, 2018) was announced, and I presented a tutorial on using it to analyze circumplex data with the SSM approach (Zimmermann & Wright, 2017). The package has seen several updates since then and (at the risk of indulging in navel-gazing) I wanted to take this opportunity to provide a brief history of the project, review its progress thus far, and provide a tentative roadmap for its future. I am also hoping to solicit feedback and engagement from the community and attract additional collaborators to the project.
As a companion to Zimmermann & Wright (2017), an R package called ssm was uploaded to Aidan’s website. This package had functions for computing several types of SSM models and producing circumplex visualizations with confidence bands. However, it also had a stubborn graphical bug where the confidence bands sometimes appeared in the wrong places. My involvement in the project began with an attempt to fix this bug, and over time the idea to expand the project beyond SSM models began forming. What ultimately settled the matter was that the ssm package name was already taken on CRAN (the primary repository for R packages) by an unrelated project. Thus, the circumplex package was born.
The early versions of the package focused on generalizing the SSM approach and generating corresponding figures (sans bug) and, for the first time, formatted results tables that could be copied directly from R into a word processor. Several months after SITAR 2018, the package was stable and standardized enough to be accepted to CRAN, and a website was launched to provide package documentation and vignette articles walking users through the use of the SSM functions.
The first major update to the package (v0.2.0) came several months later and added detailed documentation for many of the most popular circumplex questionnaire instruments as well as functions for scoring and ipsatizing item-level data and for standardizing scale-level data. Compiling all this information and contacting the instrument authors and copyright-holders was a large undertaking, and I am grateful to Sindes Dawood for her help in this process. I am also grateful to the authors and copyright-holders who graciously agreed to include this information in the package and on the website, which is now an easily accessible resource for learning about available circumplex instruments. My hope is that the website will aid in the discovery and adoption of circumplex instruments, and the instrument-related functions will enhance the accuracy and consistency of their implementation.
I have many hopes and plans for the future of the circumplex package and will share three of them here. I can’t promise that all these things will happen (and even less so when), but my goal in sharing them is to get the community thinking about what tools they would most like to see.
The first stop planned on the road ahead is to implement existing statistical techniques for assessing the fit of circumplex models to data. Some approaches already exist in R but many are deprecated (i.e., no longer supported) and it would be beneficial to collect them in a centralized location and standardize them (e.g., in terms of formatting, naming, defaults, and documentation). An important component of this will be to reimplement the now-deprecated CircE package so that estimates of the probability of correct confidence intervals can be added back into the SSM results.
The next planned stop is to build a fully featured extension to the ggplot2 package (Wickham, 2016) for plotting data in a circumplex coordinate system. SSM plots are currently created in ggplot2 using an expedient approach that limits the degree to which they can be customized. Developing a fully featured extension would be a huge undertaking but would enable exciting new visualization options in addition to improving customizability. If implemented and documented well, such an extension would have the potential to become the de facto approach to plotting circumplex data for years to come.
Finally, the third planned stop is to refine existing methods (e.g., SSM and CircE). Implementing an approach like the SSM involves choosing between many methodological options (e.g., which type of bootstrap to use, which type of confidence intervals to calculate, and how to handle missing data). Understanding the strengths and weaknesses of these options requires simulation studies and likely more statistical expertise than I currently have, so statistical collaboration is very much requested.
If you want to contribute to the circumplex package through programming, I’d be more than happy to facilitate this and can suggest tasks appropriate to your skill level. Doing so would be a great way to learn more about R and an easy introduction to package development. However, I want to be clear that contributing to this package (or any other) does not require statistical or programming expertise. Open-source software developers always (and often desperately) need help from users to find and report bugs, copyedit and improve documentation, and provide feedback on which features have been and would be most useful. Everyone is welcome and will be respected.
Please feel free to contact me at [email protected] with any questions or comments.
References
R Core Team. (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis (use R!) (2nd ed.). New York, NY: Springer.
Zimmermann, J., & Wright, A. G. C. (2017). Beyond description in interpersonal construct validation: Methodological advances in the circumplex Structural Summary Approach. Assessment, 24(1), 3–23. https://doi.org/10/f9fz5d