The forced swimming test is a method used to assess depressive-like behavior in rodents. Modifications of the original procedure developed by Porsolt et al. and their influence on the results are controversial and have been discussed in many studies. The animal's behavior is usually scored by partial interval recording: the total recording time is divided into equal intervals and the predominant behavior in each interval is recorded manually. Despite the influence of the recording method on the results, this issue has neither been studied further nor standardized. The aim of this study was to assess whether the representativeness of the data is influenced by the length of the recording interval, by recording behaviors (immobility, swimming and climbing) in the same subjects at 3, 5 and 10 s recording intervals. We used a non-pathological sample of adult male and female Wistar rats. Our results show no significant differences between these three recording intervals in the forced swimming test for the three main behaviors measured.
The forced swimming test (FST) is a method used in the assessment of depressive-like behavior in rodents that was developed by Porsolt, Bertin, and Jalfre (1977). It is based on learned helplessness, as the animal is repeatedly exposed to an aversive stimulus, water, from which it cannot escape. When the animal is placed in an inescapable cylinder filled with water, it learns that there is nothing it can do to escape and, therefore, it spends less time on regular escape behaviors (swimming, climbing and diving) in subsequent trials and more time immobile. In the animal model of depression, total immobility time is higher than in the control condition, and antidepressants have been shown to reduce the difference in total immobility time between control and experimental groups (Porsolt, LePichon, & Jalfre, 1977).
An animal model of a particular psychological condition must meet certain criteria to be applicable to humans: it should resemble the human condition in its etiology, biochemistry, symptoms and treatment (McKinney & Bunney, 1969). As reviewed by Belzung and Lemoine (2011), most authors have focused on these external validity criteria. That review proposes five basic validity criteria for an animal model, which differ slightly from the classical ones: homological validity, pathogenic validity, mechanistic validity, face validity and predictive validity (Belzung & Lemoine, 2011). Following many of these criteria, major depressive disorder has been reproduced in animals in order to study its underlying neurobiological mechanisms. This model has been developed by replicating aspects of the depressive syndrome that are not intrinsically human features, i.e. anhedonia, helplessness and behavioral despair (Krishnan & Nestler, 2011).
The validity of the FST has been questioned many times. Several authors have compared the results of the FST with those of other tests used to measure depression, such as the sucrose preference test (Grillo et al., 2011; Hong et al., 2012; Karson, Demirtas, Bayramgürler, Balci, & Utkan, 2013), and have found consistent results. Nevertheless, there is controversy about the validity of the FST as a measure of behavioral ‘despair’, as habituation has been proposed as an explanation for immobility (Hawkins, Hichs, Phillips, & Moore, 1978) due to a process of familiarization (Borsini, Volterra, & Meli, 1986). There are multiple variations on the original methodology of the FST that lead to differences in the results (Borsini & Meli, 1988). While most researchers record behaviors manually (Grillo et al., 2011; Kawai, Ishibashi, Kudo, Kawashima, & Mitsumoto, 2012; Sirianni, Olausson, Chiu, Taylor, & Saltzman, 2010; Ulloa, Díaz-Valderrama, Herrera-Pérez, León-Olea, & Martínez-Mota, 2014), others use automated devices (El-Alfy et al., 2010; Uz, Dimitrijevic, Imbesi, Manev, & Manev, 2008). Therefore, behavioral results are usually based on subjective recording.
The animal's behavior is usually recorded by partial interval recording (PIR), which consists of dividing the total recording time into equal intervals and manually recording the predominant behavior in each interval (Borsini & Meli, 1988). Some authors have also tried to automate the recording process with software that measures different mobility parameters, such as Ethovision 3.0 by Noldus (Hédou, Pryce, Di Iorio, Heidbreder, & Feldon, 2001) or CVA software by ProTrack (Gersner, Gordon-Kiwkowitz, & Zangen, 2009), although the preferred method is still the trained observer. Fiske and Delmolino (2012) discuss the advantages and disadvantages of different recording methods. Using PIR, some authors conclude that the smaller the interval, the lower the absolute and relative errors (Wirth, Slaven, & Taylor, 2014). It has also been observed that the length of the recording interval inversely affects the representativeness of the data (Repp, Roberts, Slack, Repp, & Berkler, 1976). Finally, PIR has been shown to be a more sensitive recording method than momentary time sampling (MTS) (Harrop & Daniels, 1986).
Our review of the literature (summarized in Table 1) shows that many studies using the FST do not report the recording method in their methods section. Most of the few authors who do describe it record behavior at intervals of 5 s. There are exceptions, such as Su, Hato-Yamada, Araki, and Yoshimura (2013), who quantified the duration and frequency of the behavior, and Grillo et al. (2011), who recorded at intervals of 3 s. Despite the influence of the recording method on the results, this issue has neither been studied further nor standardized.
Table 1. Summary of studies that used the forced swimming test.
Authors | Recording method | Interval size (s) | Software |
---|---|---|---|
Citó et al. (2015) | Unspecified | | |
Martínez, Brunelli, and Zimmerberg (2015) | Unspecified | | |
Ulloa et al. (2014) | PIR | 5 | |
Kołaczkowski et al. (2014) | Unspecified | | |
Karson et al. (2013) | Unspecified | | |
Yang, Hu, Zhou, Zhang, and Yang (2013) | Unspecified | | |
Su et al. (2013) | Duration and frequency | | |
Kawai et al. (2012) | PIR | 5 | |
Hong et al. (2012) | PIR | 5 | |
Grillo et al. (2011) | PIR | 3 | |
El-Alfy et al. (2010) | Automated | | SMART II Video Tracking |
Sirianni et al. (2010) | PIR | 5 | |
Uz et al. (2008) | Automated | | MotorMonitor 4.11 |
Kuśmider, Solich, Pasach, and Dziedzicka-Wasylewska (2007) | PIR | 5 | |
Deak et al. (2005) | PIR | 5 | |
Holmes, Yang, Murphy, and Crawley (2001) | PIR | 5 | |
Lucki, Dalvi, and Mayorga (2001) | Unspecified | | |
Hédou et al. (2001) | Automated | | Ethovision 3.0 |
In the present study, we assessed the relevance of using a longer or shorter interval in the recording method of the FST. To do this, we compared the behavioral results of the FST in the same sample of animals at 3, 5 and 10 s recording intervals. We expected to find more accurate results with the shortest interval and a masking effect with the longest one.
Method

Animals

A total of 10 (4 male/6 female) adult Wistar rats (Rattus norvegicus) weighing between 80 and 150 g were used. The animals were obtained from the University of Oviedo central vivarium (Oviedo, Asturias, Spain). They were housed under standard conditions (12-h light/dark cycle with lights on from 08:00 to 20:00 h), at a constant room temperature of 23±2°C, with ad libitum access to food and water. All experimental procedures carried out with animals were approved by a local veterinary committee from the University of Oviedo vivarium, and subsequent handling strictly followed the European Communities Council Directive 2010/63/EU. All efforts were made to minimize the number of animals used and their suffering.
Apparatus

The FST was performed in a Plexiglas cylindrical bin (20 cm diameter, 50 cm high) filled with water (23–27°C) to a depth of 30 cm. Trials were recorded using an automated video-tracking system (Noldus MPEG-4 Recorder V1.1.6). They were then burned to a DVD and analyzed by two experienced researchers.
Procedure

Rats were placed individually into the Porsolt bin. On the first day, a 10-min habituation session was carried out as previously described (Kawai et al., 2012). On the next day, the test was performed: the animals were placed in the Porsolt bin for 5 min, then carefully dried and returned to their home cage. Between subjects, the water was removed and the bin refilled in order to avoid any odor trail. Trials were analyzed by two experienced researchers, who scored the animal's behavior (immobility, swimming and climbing) at three different intervals: 3, 5 and 10 s. Immobility was scored when the rat was floating passively, making only small movements to keep its nose above the water surface. Swimming was scored when the rat made horizontal movements beyond those necessary merely to keep its head above the water. Climbing was scored when the rat was in active vertical motion, trying to escape.
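The interval-based scoring described in this section can be illustrated with a short Python sketch. It is not the procedure followed by the observers, who scored the videos manually: it assumes a hypothetical second-by-second annotation of one 5-min (300 s) trial, and the names `partial_interval_record`, `percentages` and `timeline` are illustrative only. Ties within an interval are resolved in favor of the behavior seen first, which is also an assumption.

```python
from collections import Counter

BEHAVIORS = ("immobility", "swimming", "climbing")

def partial_interval_record(timeline, interval_s):
    """Return the predominant behavior in each consecutive interval.

    timeline   : list with one behavior label per second (hypothetical input)
    interval_s : interval length in seconds (3, 5 or 10 in this study)
    """
    scores = []
    for start in range(0, len(timeline), interval_s):
        window = timeline[start:start + interval_s]
        # most_common(1) keeps the first-seen label when counts are tied
        scores.append(Counter(window).most_common(1)[0][0])
    return scores

def percentages(scores):
    """Percentage of scored intervals assigned to each behavior."""
    counts = Counter(scores)
    return {b: 100.0 * counts[b] / len(scores) for b in BEHAVIORS}

# Made-up 300 s trial: 40 s climbing, 80 s swimming, 180 s immobility
timeline = ["climbing"] * 40 + ["swimming"] * 80 + ["immobility"] * 180

for interval_s in (3, 5, 10):
    scores = partial_interval_record(timeline, interval_s)
    print(interval_s, len(scores), percentages(scores))
```

For a 300 s trial, the 3, 5 and 10 s intervals yield 100, 60 and 30 scored intervals, respectively, so the shortest interval produces the finest-grained record.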
Data analysis

All data were analyzed with SigmaStat 3.2 software (Systat Software, Chicago, USA) and are expressed as mean±SEM. Results were considered statistically significant when p<.05.
For each behavior, a one-factor ANOVA with three levels was performed (dependent variable: swimming, climbing or immobility; factor: recording interval, with levels 3, 5 and 10 s).
For each interval, a one-factor ANOVA with three levels was performed (dependent variable: the score obtained at the 3, 5 or 10 s interval; factor: behavior, with levels swimming, climbing and immobility).
In total, six one-factor ANOVAs with three levels were performed.
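The F statistic of a one-factor ANOVA with three levels can be sketched as follows. This is a minimal illustration in plain Python rather than the SigmaStat analysis itself; the function name `one_way_anova` and all data values are made up for the example. It treats the three levels as independent groups, which is an assumption consistent with the F(2,27) degrees of freedom reported for 10 animals and three levels.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-factor ANOVA.

    groups : list of lists, one list of observations per factor level
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical immobility scores (%) for 10 animals at each interval
immobility_3s = [45, 52, 38, 60, 49, 55, 41, 58, 47, 50]
immobility_5s = [58, 63, 55, 66, 60, 57, 62, 64, 59, 61]
immobility_10s = [60, 68, 57, 70, 63, 61, 66, 69, 58, 65]

f_value, df_b, df_w = one_way_anova([immobility_3s, immobility_5s, immobility_10s])
print(f"F({df_b},{df_w}) = {f_value:.3f}")
```

Obtaining the p value additionally requires the cumulative F distribution, which a statistics package would normally supply.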
Results

Data were analyzed using one-way ANOVA. When significant differences were found, Tukey's multiple-comparison tests were used to identify them. Normality and equal variances were assumed.
Independent-samples t-tests showed no significant differences between sexes in the behavioral results of the FST at 3, 5 and 10 s intervals (Table 2).
Table 2. Mean and standard error of the mean (SEM) of the different behaviors in male and female rats.
Interval | Behavior | Male | Female | t-test p value |
---|---|---|---|---|
3 s | Immobility | 40.83±10.28 | 53.14±6.83 | 0.327 |
3 s | Swimming | 43.33±10.25 | 31.85±6.75 | 0.355 |
3 s | Climbing | 15.83±4.82 | 15±2.89 | 0.878 |
5 s | Immobility | 56.48±9.36 | 62.65±3.18 | 0.484 |
5 s | Swimming | 28.70±7.03 | 20.06±4.52 | 0.307 |
5 s | Climbing | 14.81±2.92 | 16.97±2.59 | 0.602 |
10 s | Immobility | 55.55±2.61 | 67.90±5.63 | 0.131 |
10 s | Swimming | 28.70±2.77 | 15.43±4.62 | 0.064 |
10 s | Climbing | 15.74±3.81 | 16.66±3.13 | 0.856 |
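The sex comparison can be reproduced from summary statistics alone. The sketch below computes a pooled-variance (Student) t statistic from the mean and SEM of each group, using the 3 s immobility values and the group sizes given in the Animals section (4 males, 6 females). The choice of the pooled-variance variant is an assumption, since the text only states that independent-samples t-tests were used, and the function name `pooled_t_from_summary` is illustrative.

```python
import math

def pooled_t_from_summary(mean1, sem1, n1, mean2, sem2, n2):
    """Student's t (pooled variance) from group means, SEMs and sizes."""
    # SEM = SD / sqrt(n), so the sample variance is SEM^2 * n
    var1 = sem1 ** 2 * n1
    var2 = sem2 ** 2 * n2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# 3 s interval, immobility: males 40.83±10.28 (n=4), females 53.14±6.83 (n=6)
t_value, df = pooled_t_from_summary(40.83, 10.28, 4, 53.14, 6.83, 6)
print(f"t({df}) = {t_value:.3f}")
```

Under these assumptions the statistic is about -1.04 with 8 degrees of freedom; converting it to a p value requires the cumulative t distribution from a statistics package.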
We analyzed whether each behavior differed according to the recording interval used (Table 3). Immobility showed no significant differences between intervals (F(2,27)=2.816; p=.078), nor did climbing (F(2,27)=0.053; p=.948). For swimming, the test yielded F(2,27)=3.455; p=.046, a value close to the .05 threshold.
Overall, we found significant differences between behaviors, with immobility greater than swimming and climbing (F(2,99)=9.204; p=.001). Tukey's test showed that these differences occurred when behavior was scored at 5 and 10 s intervals (p=.012 and p=.003, respectively), but not at the 3 s interval (p=.462).
Cohen's kappa coefficient was used to assess agreement between the two judges. It showed substantial agreement (κ=0.759, p<.05, 95% CI [0.670, 0.847]) according to the criteria of Landis and Koch (1977).
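For reference, Cohen's kappa compares the observed proportion of agreement with the agreement expected by chance from each rater's marginal frequencies. The following Python sketch shows the calculation on made-up interval scores from two hypothetical raters; the function names and the data are illustrative and are not the scores obtained in this study. The verbal labels follow the bands proposed by Landis and Koch (1977).

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same intervals."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of the raters' marginal proportions
    expected = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / n ** 2
    if expected == 1:  # both raters used a single identical category
        return 1.0
    return (observed - expected) / (1 - expected)

def landis_koch_label(kappa):
    """Verbal label for kappa following Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical scores: I = immobility, S = swimming, C = climbing
rater_a = ["I", "I", "I", "S", "S", "C", "C", "I", "S", "I"]
rater_b = ["I", "I", "S", "S", "S", "C", "I", "I", "S", "I"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f} ({landis_koch_label(kappa)})")
```

Applied to the value reported above, `landis_koch_label(0.759)` falls in the 0.61 to 0.80 band, which corresponds to substantial agreement.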
Discussion

Given the lack of information on the use of different observational intervals in the FST, we examined whether recording at 3, 5 or 10 s intervals leads to significant differences in the behaviors observed in this test: swimming, climbing and immobility. Our results indicate that there are no differences between the recording intervals in any of the behaviors observed.
Although most researchers use only male animals when they assess behavior in the FST, we used both sexes in order to assess behaviors in a more representative sample. We found no differences between sexes in behavioral results at any interval. This can be explained by the fact that our animals were control animals that did not model any disturbance or disease. However, in many animal models of disease both sexes should be analyzed, as sex differences in behavior exist in some pathological conditions (Bai et al., 2014; Papaioannou, Gerozissis, Prokopiou, Bolaris, & Stylianopoulou, 2002).
We found no differences between the use of 3, 5 or 10 s intervals in any of the behaviors observed: swimming, climbing and immobility. Some authors conclude that the smaller the interval, the lower the absolute and relative errors (Wirth et al., 2014); however, our results point in a different direction. One reason for this discrepancy could be that Wirth et al. (2014) explored longer intervals than we did: they used intervals between 15 s and 450 s.
In the opinion of other authors, the length of the recording interval inversely affects the representativeness of the data (Repp et al., 1976). However, in our study the long recording interval (10 s) and the standard medium interval used in most studies (5 s) gave equivalent results: in both, we found more immobility than swimming and climbing. In our case, the short recording interval (3 s) could be less representative because it might lead to a loss of data. This loss of information could be due to the excessive atomization of behaviors that usually last longer than 3 s. We should also take into account that our behavioral data are highly homogeneous, as they come from a control sample; in a sample with more heterogeneous behaviors, excessively long intervals could lead to a loss of information, because within such an interval the animal's behavior might change many times. In that case, the recording interval length should be reconsidered.
Our objective in this work was to explore the consequences of using different intervals in the recording of the FST. Although there are many published works on the correct use of methodology in this test (Bogdanova, Kanekar, D’Anci, & Renshaw, 2013; Petit-Demouliere, Chenu, & Bourin, 2005), to the best of our knowledge no published work has examined the effect of the recording interval on the behaviors observed in the test.
The majority of published works do not even describe the interval used (see Table 1). Some authors indicate that they use a 5 s interval when they analyze behaviors in the FST, but the reason for choosing this interval over another is rarely given. According to our results, however, recording at 5 s intervals seems to be a suitable method.
It is important to mention that we used a subjective recording method and, for this reason, the observers had to be trained. It is also highly recommended that behaviors be analyzed by more than one examiner. Regarding the examiners' training, it would be of great interest to use more than one animal, so that examiners learn to recognize behaviors that vary slightly between subjects.
In our study, we used two trained observers and obtained substantial inter-judge agreement. This agreement indicates high consistency between raters and suggests that the measurement system was well implemented.
A limitation of our work is that it was performed only with control animals. We chose this non-pathological sample because we wanted to assess the reliability of the interval used by the examiners without introducing any other variable. This test is widely used in depression models (Kawai et al., 2012; Shaldubina, Einat, Bersudsky, & Belmaker, 2005; Ulloa et al., 2014) and in related models, such as stress-induced anhedonia (Briones et al., 2012). For this reason, future research should check whether these results hold for pathological models, or whether such models require a specific recording interval.
We conclude that there are no differences between these three recording intervals (3, 5 and 10 s) in the scoring of the forced swimming test for the three main behaviors measured: climbing, swimming and immobility. In some cases, however, the use of 3 s recording intervals could lead to a loss of relevant information.
Conflicts of interest

All authors declare that there are no actual or potential conflicts of interest, including any financial, personal or other relationships with other people or organizations, that could inappropriately influence this work.
This research was supported by the Spanish Ministry of Economy and Competitiveness (project grant PSI2013-45924-P and BES-2014-070562 to M.B.L.) and by the Consejería de Economía y Competitividad del Gobierno del Principado de Asturias.