【2018-Airbnb】Applying Deep Learning To Airbnb Search

About

Abstract and Introduction

Model Evolution

SimpleNN

Lambdarank NN

import numpy as np
import tensorflow as tf  # TF 1.x style API (graph + session)

def apply_discount(x):
    '''Apply positional discount curve.'''
    return np.log(2.0) / np.log(2.0 + x)

def compute_weights(logit_op, session):
    '''Compute loss weights based on delta NDCG.
    logit_op is a [BATCH_SIZE, NUM_SAMPLES] shaped tensor corresponding to the output layer of the network.
    Each row corresponds to a search and each
    column a listing in the search result. Column 0 is the booked listing, while columns 1 through
    NUM_SAMPLES - 1 the not-booked listings.'''
    logit_vals = session.run(logit_op)
    num_samples = logit_vals.shape[1]
    # Double argsort yields ascending ranks; flip so the highest logit gets rank 0.
    ranks = num_samples - 1 - logit_vals.argsort(axis=1).argsort(axis=1)
    discounted_non_booking = apply_discount(ranks[:, 1:])
    discounted_booking = apply_discount(np.expand_dims(ranks[:, 0], axis=1))
    discounted_weights = np.abs(discounted_booking - discounted_non_booking)
    return discounted_weights

# logit_op and session are assumed to be defined by the surrounding training code.
# Compute the pairwise loss (booked listing vs. each not-booked listing)
pairwise_loss = tf.nn.sigmoid_cross_entropy_with_logits(
    labels=tf.ones_like(logit_op[:, 1:]),
    logits=logit_op[:, :1] - logit_op[:, 1:])
# Compute the lambdarank weights based on delta ndcg
weights = compute_weights(logit_op, session)
# Multiply pairwise loss by lambdarank weights
loss = tf.reduce_mean(tf.multiply(pairwise_loss, weights))
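
To make the weighting concrete, here is a minimal NumPy-only sketch of the same computation on one search. The names `lambdarank_weights` and `weighted_pairwise_loss` and the logit values are hypothetical, chosen for illustration. It is not the production code, only a way to inspect the numbers without a TensorFlow session.

```python
import numpy as np

def apply_discount(rank):
    # Positional discount: 1.0 at rank 0, decaying logarithmically with rank.
    return np.log(2.0) / np.log(2.0 + rank)

def lambdarank_weights(logit_vals):
    # logit_vals: [batch, num_samples]; column 0 is the booked listing.
    num_samples = logit_vals.shape[1]
    # Double argsort gives ascending ranks; flip so the top logit has rank 0.
    ranks = num_samples - 1 - logit_vals.argsort(axis=1).argsort(axis=1)
    booked = apply_discount(ranks[:, :1])
    not_booked = apply_discount(ranks[:, 1:])
    return np.abs(booked - not_booked)

def weighted_pairwise_loss(logit_vals):
    # Sigmoid cross entropy with label 1 on (booked - not_booked)
    # equals softplus(-diff) = log(1 + exp(-diff)).
    diff = logit_vals[:, :1] - logit_vals[:, 1:]
    pairwise = np.logaddexp(0.0, -diff)
    return float(np.mean(pairwise * lambdarank_weights(logit_vals)))

# Hypothetical logits: the booked listing (column 0) is currently ranked second.
logits = np.array([[0.2, 1.5, -0.3]])
weights = lambdarank_weights(logits)
loss = weighted_pairwise_loss(logits)
```

In this example the not-booked listing ranked above the booked one receives the larger weight, so swaps near the top of the result list dominate the loss.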

Decision Tree/Factorization Machine NN

Deep NN

Failed Models

Listing ID

Multi-task learning

Xing Yi, Liangjie Hong, Erheng Zhong, Nathan Nan Liu, and Suju Rajan. 2014. Beyond Clicks: Dwell Time for Personalization. In Proceedings of the 8th ACM Conference on Recommender Systems (RecSys ’14). ACM, New York, NY, USA,

Feature Engineering

Power-Law Distribution Normalization
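
As a rough illustration of this step: the paper named in the title maps features that look normally distributed to (value - mean) / std, and features that look power-law distributed to log((1 + value) / (1 + median)). The sketch below assumes non-negative feature values and a non-constant feature. The function names and sample data are hypothetical.

```python
import numpy as np

def normalize_normal(values):
    # For features resembling a normal distribution: standard z-score.
    values = np.asarray(values, dtype=np.float64)
    return (values - values.mean()) / values.std()

def normalize_power_law(values):
    # For features resembling a power-law distribution:
    # log((1 + value) / (1 + median)), which maps the median to 0.
    values = np.asarray(values, dtype=np.float64)
    median = np.median(values)
    return np.log((1.0 + values) / (1.0 + median))

# Hypothetical heavy-tailed feature with one extreme value.
raw = np.array([0.0, 1.0, 3.0, 7.0, 1023.0])
scaled = normalize_power_law(raw)
```

The log transform compresses the long tail, so the extreme value no longer dwarfs the rest of the feature's range.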

DNN Output Distribution

Transforming Latitude and Longitude

Occupancy

Location Preference

System Engineering

Hyperparameters

Feature Importance

TopBot Analysis

References

  1. https://developers.google.com/machine-learning/guides/rules-of-ml/
  2. Daria Sorokina and Erick Cantu-Paz. 2016. Amazon Search: The Joy of Ranking Products. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’16). 459–460.
  3. Peng Ye, Julian Qian, Jieying Chen, Chen-Hung Wu, Yitong Zhou, Spencer De Mars, Frank Yang, and Li Zhang. 2018. Customized Regression Model for Airbnb Dynamic Pricing. In Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
  4. Sebastian Ruder. 2017. An Overview of Multi-Task Learning in Deep Neural Networks. CoRR abs/1706.05098 (2017). arXiv:1706.05098 http://arxiv.org/abs/1706.05098
  5. Xing Yi, Liangjie Hong, Erheng Zhong, Nathan Nan Liu, and Suju Rajan. 2014. Beyond Clicks: Dwell Time for Personalization. In Proceedings of the 8th ACM Conference on Recommender Systems (RecSys ’14). ACM, New York, NY, USA,
  6. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. 2017. Understanding deep learning requires rethinking generalization. https://arxiv.org/abs/1611.03530