A novel multi-modal recommendation framework based on graph convolutional networks, explicitly modeling modal-specific user preferences to enhance micro-video recommendation.
Yinwei Wei1, Xiang Wang2, Liqiang Nie1*, Xiangnan He3, Richang Hong4, Tat-Seng Chua2
1 Shandong University, China
2 National University of Singapore, Singapore
3 University of Science and Technology of China, China
4 Hefei University of Technology, China
* Corresponding author
- Paper: ACM MM'19
- Code Repository: [`GitHub`](https://github.com/iLearn-Lab/MM19-MMGCN)
- [10/2019] Paper presented at ACM MM'19.
- [01/2020] Initial release of the PyTorch implementation and toy datasets.
This is the official PyTorch implementation for the paper MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video.
Multi-modal Graph Convolution Network (MMGCN) is a multi-modal recommendation framework based on graph convolutional networks. It explicitly models modal-specific user preferences to enhance micro-video recommendation. In this repository, we provide the updated code and adopt a full-ranking strategy for both validation and testing.
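For intuition, here is a minimal sketch of one modality-specific propagation step: mean aggregation over graph neighbors followed by a concatenation combination with an ID embedding. The class name, layer sizes, and activation choices below are illustrative assumptions, not the exact layers implemented in this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalGCNLayer(nn.Module):
    """Simplified modality-specific propagation layer (illustrative sketch).

    Aggregates neighbor features of one modality over the user-item graph
    (mean aggregation), then combines the result with the node's own ID
    embedding (concatenation combination).
    """

    def __init__(self, feat_dim, id_dim, out_dim):
        super().__init__()
        self.agg_linear = nn.Linear(feat_dim, out_dim)           # transform aggregated neighbor features
        self.comb_linear = nn.Linear(out_dim + id_dim, out_dim)  # combination layer (concat variant)

    def forward(self, x, id_embedding, edge_index):
        # x: [num_nodes, feat_dim] modality features; edge_index: [2, num_edges]
        src, dst = edge_index
        agg = torch.zeros_like(x)
        agg.index_add_(0, dst, x[src])                           # sum features of neighbors
        deg = torch.zeros(x.size(0), device=x.device)
        deg.index_add_(0, dst, torch.ones(dst.size(0), device=x.device))
        agg = agg / deg.clamp(min=1).unsqueeze(-1)               # mean aggregation
        h = F.leaky_relu(self.agg_linear(agg))
        return F.leaky_relu(self.comb_linear(torch.cat([h, id_embedding], dim=1)))
```

Stacking one such layer per modality and fusing the modality-specific representations is what lets the model capture modal-specific user preferences.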
git clone https://github.com/iLearn-Lab/MM19-MMGCN.git
cd MM19-MMGCN
The code has been tested with Python 3.5.2. The required packages are as follows:
- PyTorch == 1.1.0
- torch-cluster == 1.4.2
- torch-geometric == 1.2.1
- torch-scatter == 1.2.0
- torch-sparse == 0.4.0
- numpy == 1.16.0
Install the dependencies using pip:
pip install torch==1.1.0 torchvision
pip install torch-scatter==1.2.0 torch-sparse==0.4.0 torch-cluster==1.4.2 torch-geometric==1.2.1
pip install numpy==1.16.0
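After installation, a quick import check (an optional sanity step, not part of the original instructions) confirms that torch-geometric and its companion packages load against the installed PyTorch:

```bash
python -c "import torch, torch_geometric; print(torch.__version__, torch_geometric.__version__)"
```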
We use three processed datasets: Kwai, Tiktok, and Movielens. Due to copyright restrictions, we cannot release the full datasets directly; if you need them, please contact the respective data owners through their official sources. To facilitate this line of research, we provide toy versions of the datasets:
| Dataset | #Interactions | #Users | #Items | Visual dim. | Acoustic dim. | Textual dim. |
|---|---|---|---|---|---|---|
| Kwai | 1,664,305 | 22,611 | 329,510 | 2,048 | - | 100 |
| Tiktok | 726,065 | 36,656 | 76,085 | 128 | 128 | 128 |
| Movielens | 1,239,508 | 55,485 | 5,986 | 2,048 | 128 | 100 |
- train.npy: Train file. Each line is a user with her/his positive interactions with items (userID and micro-video ID).
- val.npy: Validation file. Each line is a user with several positive interactions with items (userID and micro-video ID).
- test.npy: Test file. Each line is a user with several positive interactions with items (userID and micro-video ID).
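As a quick sanity check, the snippet below loads and inspects the interaction files. The `allow_pickle=True` flag and the exact per-entry layout are assumptions to verify against the released files:

```python
import numpy as np

# allow_pickle=True is needed only if the arrays store Python objects
# (an assumption -- check against the released toy datasets).
train = np.load('train.npy', allow_pickle=True)
val = np.load('val.npy', allow_pickle=True)
test = np.load('test.npy', allow_pickle=True)

print('training entries:', len(train))
print('first entry (userID and micro-video IDs):', train[0])
```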
The usage of each command-line argument is documented in the code. Example commands for training the models on the different datasets follow the argument list below.
Some important arguments:
- `model_name`: It specifies the type of model. Here we provide five options:
  - `MMGCN` (by default), proposed in MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video, ACM MM 2019. Usage: `--model_name 'MMGCN'`
  - `VBPR`, proposed in VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback, AAAI 2016. Usage: `--model_name 'VBPR'`
  - `ACF`, proposed in Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention, SIGIR 2017. Usage: `--model_name 'ACF'`
  - `GraphSAGE`, proposed in Inductive Representation Learning on Large Graphs, NIPS 2017. Usage: `--model_name 'GraphSAGE'`
  - `NGCF`, proposed in Neural Graph Collaborative Filtering, SIGIR 2019. Usage: `--model_name 'NGCF'`
- `aggr_mode`: It specifies the type of aggregation layer. Here we provide three options:
  - `mean` (by default) implements mean aggregation in the aggregation layer. Usage: `--aggr_mode 'mean'`
  - `max` implements max aggregation in the aggregation layer. Usage: `--aggr_mode 'max'`
  - `add` implements sum aggregation in the aggregation layer. Usage: `--aggr_mode 'add'`
- `concat`: It indicates the type of combination layer. Here we provide two options:
  - `concat` (by default) implements concatenation combination in the combination layer. Usage: `--concat 'True'`
  - `ele` implements element-wise combination in the combination layer. Usage: `--concat 'False'`
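For example (the entry-point script name `main.py` and the `--data_path` flag are assumptions; check the code for the exact names):

```bash
# Train MMGCN with default mean aggregation and concatenation combination
python main.py --model_name 'MMGCN' --data_path 'Tiktok' --aggr_mode 'mean' --concat 'True'

# Train with max aggregation and element-wise combination instead
python main.py --model_name 'MMGCN' --data_path 'Movielens' --aggr_mode 'max' --concat 'False'
```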
License
The copyright for the program is owned by Shandong University.
This program is licensed under the GNU General Public License 3.0. Any derivative work obtained under this license must be licensed under the GNU General Public License as published by the Free Software Foundation, either Version 3 of the License, or (at your option) any later version, if this derivative work is distributed to a third party.
For commercial projects that require the ability to distribute the code of this program as part of a program that cannot be distributed under the GNU General Public License, please contact weiyinwei@hotmail.com to purchase a commercial license.
If you use our code or datasets in your research, please cite:
@inproceedings{MMGCN,
  title     = {MMGCN: Multi-modal graph convolution network for personalized recommendation of micro-video},
  author    = {Wei, Yinwei and Wang, Xiang and Nie, Liqiang and He, Xiangnan and Hong, Richang and Chua, Tat-Seng},
  booktitle = {Proceedings of the 27th ACM International Conference on Multimedia},
  pages     = {1437--1445},
  year      = {2019}
}
