Grouping daily data by month in Python/pandas and then normalising
Hi, I have the table below in a pandas DataFrame:
    q_string   q_visits  q_date
0   nucleus        1790  2012-10-02 00:00:00
1   neuron          364  2012-10-02 00:00:00
2   current         280  2012-10-02 00:00:00
3   molecular       259  2012-10-02 00:00:00
4   stem            201  2012-10-02 00:00:00
The table contains query volume from a server log, by day. I would like to
do 2 things:
(1) I would like to group the queries by month, summing each query's
volume over the whole month. For example, if 'molecular' appeared on
2012-10-02 with volume 1000 and on 2012-10-03 with volume 500, the new
table should have a single entry for it with volume 1500 and date
2012-10-31. Every date in the transformed table will be a month end,
standing for the whole month to which it relates.
(2) I want to add a 5th column containing the month-normalised
'q_visits', i.e. a term's monthly query volume divided by the total query
volume for that month across all terms.
What is the best way of doing this?
Thanks in advance for any help.
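
For reference, here is a minimal sketch of one way the two steps might be done. It is not necessarily the best way. The sample rows and the new column name 'q_visits_norm' are made up for illustration, and it assumes 'q_date' is already a datetime column.

```python
import pandas as pd

# Hypothetical sample rows in the same shape as the table above.
df = pd.DataFrame({
    "q_string": ["molecular", "molecular", "nucleus", "nucleus"],
    "q_visits": [1000, 500, 1790, 210],
    "q_date": pd.to_datetime(
        ["2012-10-02", "2012-10-03", "2012-10-02", "2012-11-05"]
    ),
})

# Map every date to the last day of its month. MonthEnd(0) rolls forward
# to the month end and leaves dates already on a month end unchanged.
month_end = df["q_date"].dt.normalize() + pd.offsets.MonthEnd(0)

# (1) Sum each term's visits within each month.
monthly = (
    df.assign(q_date=month_end)
      .groupby(["q_date", "q_string"], as_index=False)["q_visits"]
      .sum()
)

# (2) Divide each term's monthly volume by that month's total over all terms.
# 'q_visits_norm' is an illustrative column name, not from the original table.
month_total = monthly.groupby("q_date")["q_visits"].transform("sum")
monthly["q_visits_norm"] = monthly["q_visits"] / month_total

print(monthly)
```

Using transform("sum") keeps the monthly totals aligned row by row with the grouped table, so the division needs no merge, and the normalised values within each month add up to 1.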