TraMineR for time-use data with more than a hundred distinct activities

I am trying to analyze sequences with TraMineR.

UKTUS records people's activities in 10-minute slots, stored in the dataset as the variables act1_1, act1_2, ..., act1_144 (144 x 10 minutes).

Each time step (act1_1, act1_2, act1_3, ...) takes one of the following codes:

-1 Not applicable
           0 Unspecified personal care
         110 Sleep
         111 Sleep: In bed not asleep
         120 Sleep: Sick in bed
         210 Eating
         300 Other personal care: Unspecified other personal care
         310 Other personal care: Wash and dress
         390 Other personal care: Other specified personal care
        1000 Unspecified employment
        1100 Main job: unspecified main job
        1110 Main job: Working time in main job
        1120 Main job: Coffee and other breaks in main job
        1210 Second job: Working time in second job
        1220 Second job: Coffee and other breaks in second job
        1300 Activities related to employment: Unspecified activities related to employment
        1310 Activities related to employment: Lunch break
        1390 Activities related to employment: Other specified activities related to employment
        1391 Activities related to employment: Activities related to job seeking
        1399 Activities related to employment: Other unspecified activities related to employment
        2000 Study: Unspecified study school or university
        2100 Study: Unspecified activities related to school or university
        2110 Study: Classes and lectures
        2120 Study: Homework
        2190 Study: other specified activities related to school or university
        2210 Free time study
        3000 Unspecified household and family care
        3100 Unspecified food management
        3110 Food preparation and baking
        3130 Dish washing
        3140 Preserving
        3190 Other specified food management
        3200 Unspecified household upkeep
        3210 Cleaning dwelling
        3220 Cleaning yard
        3230 Heating and water
        3240 Arranging household goods and materials
        3250 Disposal of waste
        3290 Other or unspecified household upkeep
        3300 Unspecified making and care for textiles
        3310 Laundry
        3320 Ironing
        3330 Handicraft and producing textiles
        3390 Other specified making and care for textiles
        3410 Gardening
        3420 Tending domestic animals
        3430 Caring for pets
        3440 Walking the dog
        3490 Other specified gardening and pet care
        3500 Unspecified construction and repairs
        3510 House construction and renovation
        3520 Repairs of dwelling
        3530 Making repairing and maintaining equipment
        3531 Woodcraft metalcraft sculpture and pottery
        3539 Other specified making repairing and maintaining equipment
        3540 Vehicle maintenance
        3590 Other specified construction and repairs
        3600 Unspecified shopping and services
        3610 Unspecified shopping
        3611 Shopping mainly for food
        3612 Shopping mainly for clothing
        3613 Shopping mainly related to accommodation
        3614 Shopping or browsing at car boot sales or antique fairs
        3615 Window shopping or other shopping as leisure
        3619 Other specified shopping
        3620 Commercial and administrative services
        3630 Personal services
        3690 Other specified shopping and services
        3710 Household management not using the internet
        3713 Shopping for and ordering clothing via the internet
        3720 Unspecified household management using the internet
        3721 Shopping for and ordering unspecified goods and services via the internet
        3722 Shopping for and ordering food via the internet
        3724 Shopping for and ordering goods and services related to accommodation via the internet
        3725 Shopping for and ordering mass media via the internet
        3726 Shopping for and ordering entertainment via the internet
        3727 Banking and bill paying via the internet
        3729 Other specified household management using the internet
        3800 Unspecified childcare
        3810 Unspecified physical care & supervision of a child
        3811 Feeding the child
        3819 Other and unspecified physical care & supervision of a child
        3820 Teaching the child
        3830 Reading playing and talking with child
        3840 Accompanying child
        3890 Other or unspecified childcare
        3910 Unspecified help to a non-dependent eg injured adult household member
        3911 Physical care of a non-dependent e.g. injured adult household member
        3914 Accompanying a non-dependent adult household member e.g. to hospital
        3919 Other specified help to a non-dependent adult household member
        3920 Unspecified help to a dependent adult household member
        3921 Physical care of a dependent adult household member e.g. Alzheimic parent
        3924 Accompanying a dependent adult household member e.g. Alzheimic
        3929 Other specified help to a dependent adult household member
        4000 Unspecified volunteer work and meetings
        4100 Unspecified organisational work
        4110 Work for an organisation
        4120 Volunteer work through an organisation
        4190 Other specified organisational work
        4200 Unspecified informal help to other households
        4210 Food management as help to other households
        4220 Household upkeep as help to other households
        4230 Gardening and pet care as help to other households
        4240 Construction and repairs as help to other households
        4250 Shopping and services as help to other households
        4260 Help to other households in employment and farming
        4270 Unspecified childcare as help to other households
        4271 Physical care and supervision of child as help to other household
        4272 Teaching non-coresident child
        4273 Reading playing & talking to non-coresident child
        4274 Accompanying non-coresident child
        4275 Physical care and supervision of own child as help to other household
        4277 Reading playing & talking to own non-coresident child
        4278 Accompanying own non-coresident child
        4279 Other specified childcare as help to other household
        4280 Unspecified help to an adult of another household
        4281 Physical care and supervision of an adult as help to another household
        4282 Accompanying an adult as help to another household
        4283 Other specified help to an adult member of another household
        4289 Other specified informal help to another household
        4290 Other specified informal help
        4300 Unspecified participatory activities
        4310 Meetings
        4320 Religious activities
        4390 Other specified participatory activities
        5000 Unspecified social life and entertainment
        5100 Unspecified social life
        5110 Socialising with family
        5120 Visiting and receiving visitors
        5130 Celebrations
        5140 Telephone conversation
        5190 Other specified social life
        5200 Unspecified entertainment and culture
        5210 Cinema
        5220 Unspecified theatre or concerts
        5221 Plays musicals or pantomimes
        5222 Opera operetta or light opera
        5223 Concerts or other performances of classical music
        5224 Live music other than classical concerts opera and musicals
        5225 Dance performances
        5229 Other specified theatre or concerts
        5230 Art exhibitions and museums
        5240 Unspecified library
        5241 Borrowing books records audiotapes videotapes CDs VDs etc. from a library
        5242 Reference to books and other library materials within a library
        5243 Using internet in the library
        5244 Using computers in the library other than internet use
        5245 Reading newspapers in a library
        5249 Other specified library activities
        5250 Sports events
        5290 Other unspecified entertainment and culture
        5291 Visiting a historical site
        5292 Visiting a wildlife site
        5293 Visiting a botanical site
        5294 Visiting a leisure park
        5295 Visiting an urban park playground designated play area
        5299 Other or unspecified entertainment or culture
        5310 Resting - Time out
        6000 Unspecified sports and outdoor activities
        6100 Unspecified physical exercise
        6110 Walking and hiking
        6111 Taking a walk or hike that lasts at least miles or 1 hour
        6119 Other walk or hike
        6120 Jogging and running
        6130 Biking skiing and skating
        6131 Biking
        6132 Skiing or skating
        6140 Unspecified ball games
        6141 Indoor pairs or doubles games
        6142 Indoor team games
        6143 Outdoor pairs or doubles games
        6144 Outdoor team games
        6149 Other specified ball games
        6150 Gymnastics
        6160 Fitness
        6170 Unspecified water sports
        6171 Swimming
        6179 Other specified water sports
        6190 Other specified physical exercise
        6200 Unspecified productive exercise
        6210 Hunting and fishing
        6220 Picking berries mushroom and herbs
        6290 Other specified productive exercise
        6310 Unspecified sports related activities
        6311 Activities related to sports
        6312 Activities related to productive exercise
        7000 Unspecified hobbies games and computing
        7100 Unspecified arts
        7110 Unspecified visual arts
        7111 Painting drawing or other graphic arts
        7112 Making videos taking photographs or related photographic activities
        7119 Other specified visual arts
        7120 Unspecified performing arts
        7121 Singing or other musical activities
        7129 Other specified performing arts
        7130 Literary arts
        7140 Other specified arts
        7150 Unspecified hobbies
        7160 Collecting
        7170 Correspondence
        7190 Other specified or unspecified arts and hobbies
        7220 Computing - programming
        7230 Unspecified information by computing
        7231 Information searching on the internet
        7239 Other specified information by computing
        7240 Unspecified communication by computer
        7241 Communication on the internet
        7249 Other specified communication by computing
        7250 Unspecified other computing
        7251 Skype or other video call
        7259 Other specified computing
        7300 Unspecified games
        7310 Solo games and play
        7320 Unspecified games and play with others
        7321 Billiards pool snooker or petanque
        7322 Chess and bridge
        7329 Other specified parlour games and play
        7330 Computer games
        7340 Gambling
        7390 Other specified games
        8000 Unspecified mass media
        8100 Unspecified reading
        8110 Reading periodicals
        8120 Reading books
        8190 Other specified reading
        8210 Unspecified TV video or DVD watching
        8211 Watching a film on TV
        8212 Watching sport on TV
        8219 Other specified TV watching
        8220 Unspecified video watching
        8221 Watching a film on video
        8222 Watching sport on video
        8229 Other specified video watching
        8300 Unspecified listening to radio and music
        8310 Unspecified radio listening
        8311 Listening to music on the radio
        8312 Listening to sport on the radio
        8319 Other specified radio listening
        8320 Listening to recordings
        9000 Travel related to unspecified time use
        9010 Travel related to personal business
        9100 Travel to/from work
        9110 Travel in the course of work
        9120 Travel to work from home and back only
        9130 Travel to work from a place other than home
        9210 Travel related to education
        9230 Travel escorting to/ from education
        9310 Travel related to household care
        9360 Travel related to shopping
        9370 Travel related to services
        9380 Travel escorting a child other than education
        9390 Travel escorting an adult other than education
        9400 Travel related to organisational work
        9410 Travel related to voluntary work and meetings
        9420 Travel related to informal help to other households
        9430 Travel related to religious activities
        9440 Travel related to participatory activities other than religious activities
        9500 Travel to visit friends/relatives in their homes not respondents household
        9510 Travel related to other social activities
        9520 Travel related to entertainment and culture
        9600 Travel related to other leisure
        9610 Travel related to physical exercise
        9620 Travel related to hunting & fishing
        9630 Travel related to productive exercise other than hunting & fishing
        9710 Travel related to gambling
        9720 Travel related to hobbies other than gambling
        9800 Travel related to changing locality
        9810 Travel to holiday base
        9820 Travel for day trip/just walk
        9890 Other specified travel
        9940 Punctuating activity
        9941 Unknown: at home
        9950 Filling in the time use diary
        9960 No main activity no idea what it might be
        9970 No main activity some idea what it might be
        9980 Illegible activity
        9990 Unspecified time use
        9999 Queryable

I created a matrix in R with 129 columns and 16533 rows.

Activities <- uktus15_diary_wide[, c("serial", "pnum", "ddayw", "DVAge", "dmonth", "dyear", "WhenDiary", "AfterDiaryDay", "WhereStart", "WhereEnd", "RushedD", "Ordinary", "KindOfDay", "Trip", "enjm1",
                                    "act1_1", "act1_2", "act1_3", "act1_4", "act1_5", "act1_6", "act1_7", "act1_8", "act1_9", "act1_10",
                                    "act1_11", "act1_12", "act1_13", "act1_14", "act1_15", "act1_16", "act1_17", "act1_18", "act1_19", "act1_20",
                                    "act1_21", "act1_22", "act1_23", "act1_24", "act1_25", "act1_26", "act1_27", "act1_28", "act1_29", "act1_30",
                                    "act1_31", "act1_32", "act1_33", "act1_34", "act1_35", "act1_36", "act1_37", "act1_38", "act1_39", "act1_40",
                                    "act1_41", "act1_42", "act1_43", "act1_44", "act1_45", "act1_46", "act1_47", "act1_48", "act1_49", "act1_50",
                                    "act1_51", "act1_52", "act1_53", "act1_54", "act1_55", "act1_56", "act1_57", "act1_58", "act1_59", "act1_60",
                                    "act1_61", "act1_62", "act1_63", "act1_64", "act1_65", "act1_66", "act1_67", "act1_68", "act1_69", "act1_70",
                                    "act1_71", "act1_72", "act1_73", "act1_74", "act1_75", "act1_76", "act1_77", "act1_78", "act1_79", "act1_80",
                                    "act1_81", "act1_82", "act1_83", "act1_84", "act1_85", "act1_86", "act1_87", "act1_88", "act1_89", "act1_90",
                                    "act1_91", "act1_92", "act1_93", "act1_94", "act1_95", "act1_96", "act1_97", "act1_98", "act1_99", "act1_100",
                                    "act1_101", "act1_102", "act1_103", "act1_104", "act1_105", "act1_106", "act1_107", "act1_108", "act1_109",
                                    "act1_110", "act1_111", "act1_112", "act1_113", "act1_114")]

Here is an example of what the data look like (I have included only serial, pnum, act1_28 (activity between 8:30 and 8:40), act1_29 (activity between 8:40 and 8:50) and act1_30 (activity between 8:50 and 9:00)):

serial  pnum  act1_28   act1_29 act1_30 
11011202    1   3110    3110    3110    
11011202    2   3310    3310    7241
11011202    4   9210    9210    9210

My question: can we use TraMineR in this case for sequence mining? Can act1_1, act1_2, ..., act1_144 be defined as a sequence? Can the activity codes be used to define the states?

asked by RforDummies, 26.09.2018
The code snippets shown are not very helpful. I suggest you provide a minimal data example with, say, 3 people and activities over 3 periods of 10 minutes. Also, I guess your activity matrix has 159 columns (144 time periods + the 15 first variables), not 129! - Gilbert, 27.09.2018
@Gilbert Thanks for your help; I have updated my question. Do you think I can use TraMineR? - RforDummies, 28.09.2018

Answers (1)

Here is how you create a state sequence object from your example data:

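library(TraMineR)

## example data: three people, three 10-minute slots
dat <- matrix(c(11011202, 1, 3110, 3110, 3110,
                11011202, 2, 3310, 3310, 7241,
                11011202, 4, 9210, 9210, 9210),
              nrow = 3, ncol = 5, byrow = TRUE)
## a matrix takes colnames(), not names(), to label its columns
colnames(dat) <- c("serial", "pnum", "act1_28", "act1_29", "act1_30")

## creating the state sequence object from the three activity columns
s <- seqdef(dat[, 3:5])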



Only four different states are involved here.

However, the alphabet for your whole dataset is much larger. Some TraMineR functions will either not work or not give useful results when the alphabet contains too many states. For example, chronograms or index plots with so many different states would be unreadable. Moreover, TraMineR automatically assigns colors to the states only as long as the size of the alphabet does not exceed 12. Of course, many functions should still work with large alphabets, such as computing the complexity of the individual sequences (seqici), the sequence of cross-sectional distributions at the successive time slots and their entropies (seqstatd), or even the pairwise dissimilarities (seqdist).
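As a rough illustration of the three functions just named, the sketch below applies them to the three-person example. The LCS metric passed to seqdist is an arbitrary choice made for this example, not a recommendation from the answer.

```r
library(TraMineR)

# Toy data from the question: three people, three 10-minute slots
dat <- matrix(c(3110, 3110, 3110,
                3310, 3310, 7241,
                9210, 9210, 9210),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("act1_28", "act1_29", "act1_30")))
s <- seqdef(dat)

# Complexity index of each individual sequence
ci <- seqici(s)

# Cross-sectional state distributions and their entropy at each time slot
st <- seqstatd(s)
st$Entropy

# Pairwise dissimilarities (LCS is an assumed choice of metric)
d <- seqdist(s, method = "LCS")
d
```

None of these calls depends on a color palette or a readable legend, which is why they are less sensitive to the size of the alphabet than the plotting functions.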

Nevertheless, to use TraMineR, I would strongly recommend that you think about recoding your states to drastically reduce your alphabet to fewer than 20 different states.
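One possible way to shrink the alphabet, sketched here under the assumption that the leading digit of the four-digit code identifies the broad category (as the code list in the question suggests), is integer division by 1000. Treating -1 as missing is likewise an assumption. Note that this grouping would lump the 99xx codes (diary filling, illegible activity, and so on) together with travel, so the real recoding probably needs more care.

```r
# Toy data from the question
dat <- data.frame(serial  = 11011202,
                  pnum    = c(1, 2, 4),
                  act1_28 = c(3110, 3310, 9210),
                  act1_29 = c(3110, 3310, 9210),
                  act1_30 = c(3110, 7241, 9210))

# Pick the activity columns by name rather than by position
act_cols <- grep("^act1_", names(dat), value = TRUE)
codes <- as.matrix(dat[, act_cols])

# Assumption: -1 ("Not applicable") is treated as missing
codes[codes < 0] <- NA

# Integer division keeps the leading digit: 3110 -> 3, 7241 -> 7, 110 -> 0
broad <- codes %/% 1000
broad

# Build the state sequence object from the recoded matrix (if TraMineR is installed)
if (requireNamespace("TraMineR", quietly = TRUE)) {
  s_broad <- TraMineR::seqdef(broad)
}
```

With this recoding the alphabet has at most ten states (0 to 9), which stays within the range where the default plots and color palettes remain usable.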

answered by Gilbert, 30.09.2018
Thanks, Gilbert. - RforDummies, 01.10.2018
Dear Gilbert, how do I reduce the number of states? - RforDummies, 12.10.2018
I would suggest starting by considering only the first digit of your activity codes. 0 is sleep and personal care, 1 is work, 2 is study, 3 is household care, ... - Gilbert, 14.10.2018
Dear Gilbert, thanks for your help. One more question, please: how do you think it would affect the sequences if I removed the activity codes that are not relevant to a specific behavior? For example, if I am studying energy-related activities and I recode my data using 0 and 1 (so that 1 stands for an energy-related activity and 0 for a non-energy-related activity), how would that affect the sequences? Thanks. - RforDummies, 15.10.2018
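
The binary recoding described in the last comment could look roughly like the sketch below. The set energy_codes is purely hypothetical and chosen only for illustration; which activities count as energy-related is a substantive decision the thread does not settle.

```r
# Toy activity codes from the question (rows = people, columns = time slots)
codes <- matrix(c(3110, 3110, 3110,
                  3310, 3310, 7241,
                  9210, 9210, 9210),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, c("act1_28", "act1_29", "act1_30")))

# HYPOTHETICAL set of energy-related codes (food preparation, laundry)
energy_codes <- c(3110, 3310)

# 1 = energy-related, 0 = anything else; assigning into binary[] keeps the matrix shape
binary <- codes
binary[] <- as.integer(codes %in% energy_codes)
binary
```

After such a recoding, every non-energy activity collapses into the single state 0, so a change from one non-energy activity to another no longer shows up as a transition in the sequence.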