Many times, we need data from Web of Science, but we don’t know the WOS ID. If you know the paper DOI, then it’s easy to solve.
First of all, I recommend you to read these two posts by John Kitchin:
- Accessing web of science entry, citing and related articles from a doi in emacs
- Getting a WOS Accession number from a DOI
As John noted, if the paper’s DOI is 10.1021/jp047349j
, for example, you can directly access the data of Web of Science regarding this paper via this link: http://ws.isiknowledge.com/cps/openurl/service?url_ver=Z39.88-2004&rft_id=info:doi/10.1021/jp047349j. And you can access the information about papers that have cited this paper via this link: http://ws.isiknowledge.com/cps/openurl/service?url_ver=Z39.88-2004&rft_id=info%3Adoi%2F10.1021/jp047349j&svc_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Asch_svc&svc.citing=yes
You can use this pattern for all other papers with different DOIs.
If you have to get the corresponding WOS ID, you can use this Python script I created:
# loading pacakages
import pandas as pd
import csv
import random
import requests
import re
import time
import numpy as np
# Suppose I have these DOIS and I want WOS ID
dois = ['10.1109/TVCG.2006.143',
'10.1109/VISUAL.1997.663902',
'10.1109/TVCG.2006.120',
'10.1109/TVCG.2006.160',
'10.1109/INFVIS.2002.1173156',
'10.1109/VISUAL.1996.568127',
'10.1109/TVCG.2008.137',
'10.1109/VISUAL.1996.568113',
'10.1109/TVCG.2009.122',
'10.1109/VISUAL.1999.809894']
# initialize a list of dictionaries
doi_wos_dict_list = []
def get_wos_id_from_doi(doi):
"""obtain wos id based on paper doi
Arguments:
doi: paper doi
Returns:
The updated doi_wos_dict_list
"""
url = 'http://ws.isiknowledge.com/cps/openurl/service?url_ver=Z39.88-2004&rft_id=info:doi/' + doi
headers = {
"user-agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36",
}
response = requests.get(url=url, headers=headers)
wos_url = response.history[-1].url
wos_id_list = re.findall(r'(?<=2FWOS%3A)(.*)(?=%3F)', wos_url)
# In case that the response does not contain WOS ID:
if wos_id_list:
wos_id = wos_id_list[0]
else:
wos_id = np.NaN
doi_wos_dict = {
'DOI': doi,
'WOS ID': wos_id
}
doi_wos_dict_list.append(doi_wos_dict)
# For each doi in the list, run the function
for doi in dois:
get_wos_id_from_doi(doi)
time.sleep(2+random.uniform(0, 2))
print(f'{dois.index(doi)} is done')
# initiate a dataframe
doi_wos_df_initiate = pd.DataFrame(columns=['DOI', 'WOS ID'])
def build_df_from_dict_list(df, dict_list):
"""build df from a list of dictionaries
Arguments:
df: an empty df you just initiated
dict_list: a list of dictionaries containing data you want to form a df
Returns:
The updated df
"""
for i in dict_list:
df_1 = pd.DataFrame([i])
df = df.append(df_1, ignore_index=True)
return df
doi_wos_df = build_df_from_dict_list(doi_wos_df_initiate, doi_wos_dict_list)
doi_wos_df
will contain the results.
After you know the WOS ID, for example,000270778900093
, you can access this paper’s data via this link: https://www.webofscience.com/wos/woscc/full-record/WOS:000270778900093
Last modified on 2022-01-31