A faster / cheaper / more accurate alternative to the
ANNUAL
EMPLOYMENT SURVEY ?
---------------------------------------------------------------
A Proposal
submitted to
* Shri TCA Anant ,
Chief Statistician ,
Ministry of
Statistics and Programme Implementation ,
Government of India ,
------------------------------------------------------------------------------------
Suggested By :
Hemen Parekh
Mumbai / (M) 0 - 98,67,55,08,08
hcp@RecruitGuru.com
www.B2BmessageBlaster.com
27 Oct 2015
-------------------------------------------------------------------------------------
Each record in the job advt database consists of:
Ø Advt ID
Ø Designation (being advertised)
Ø Company Name (Advertiser)
Ø Job Description
Ø Desired Profile
Ø Compensation Offered
Ø Experience (desired), in years
Ø Industry Type
Ø Education Qualification (minimum)
Ø Location (Posting City)
Ø Keywords
Ø Advt Posting Date
Ø Expiry Date
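To make the record layout concrete, here is a minimal sketch of one job advt as a Python data class. The field names and types are my own assumptions for illustration (the list above gives only the labels), and the `is_live` helper is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class JobAdvert:
    """One job advt record; field names are illustrative assumptions."""
    advt_id: str
    designation: str
    company_name: str
    job_description: str
    desired_profile: str
    industry_type: str
    min_education: str
    location: str
    compensation_offered: Optional[str] = None   # None when the advt omits it
    experience_years: Optional[float] = None     # desired experience, in years
    keywords: List[str] = field(default_factory=list)
    posting_date: Optional[date] = None
    expiry_date: Optional[date] = None

    def is_live(self, on: date) -> bool:
        """True if the advt was posted on or before `on` and has not expired."""
        if self.posting_date is None or self.expiry_date is None:
            return False
        return self.posting_date <= on <= self.expiry_date
```

Keeping Compensation and Experience optional matters later, because one of the questions below is how often advertisers leave them blank.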
We were displaying PIE-CHARTS of:
Ø Industry-wise Jobs
Ø City-wise Jobs
You will observe that, with a much larger database available now, it is possible to analyze / display the “No of Jobs” in many more ways.
Not only that, it should be possible to analyze this huge database to predict the future expected PATTERN of the occurrence of jobs, in many different ways!
At any given time, the number of jobs getting advertised is an important Economic Indicator.
If the economy is booming and company Order Books are getting fatter, then more jobs will get advertised, and vice-versa.
Hence, a time-series of the number of new jobs getting posted on job portals can be expected to move closely with the state of the economy (a high coefficient of correlation).
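The claimed relationship can be checked rather than assumed. The sketch below counts new postings per month and computes the Pearson coefficient of correlation against any monthly economic series. The function names are hypothetical, and no real figures are implied: you would feed in the actual posting dates and an actual indicator.

```python
from collections import Counter
from datetime import date
from math import sqrt


def monthly_counts(posting_dates):
    """Count new job advts per (year, month), in chronological order."""
    counts = Counter((d.year, d.month) for d in posting_dates)
    return dict(sorted(counts.items()))


def pearson(xs, ys):
    """Pearson coefficient of correlation between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)
```

A coefficient near +1 would support the straight-line claim; a value near 0 would say that job postings, at least on their own, are a poor indicator.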
Apart from that, can data mining of 5 million job advts answer (even partially) the following questions?
Ø Who (which Companies) are advertising, and when?
Ø What jobs / vacancies / positions are being advertised?
Ø What is the frequency with which a particular job gets advertised? By the entire industry? By a given Company?
Ø Which regions / cities have the max / min number of new jobs?
Ø What are the regional disparities due to?
Ø Which Industries are advertising most, and so creating most jobs?
Ø What Edu Qualifications are in max demand?
Ø What kind of jobs demand what kind of Edu Qualifications?
Ø What is the level of correlation between Position and the years of Experience demanded?
Ø For identical positions being advertised, how much do “Job Descriptions / Desired Profiles” differ from company to company?
Ø Are there significant differences in the “No of years of Experience” being demanded for identical positions?
Ø What is the probability of finding the “Keywords” in the “Job Description / Desired Profile”?
Ø What is the extent of duplication (redundancy?) between “Job Description” and “Desired Profile”?
Ø What percentage of Advts fail to make any mention of Compensation Offered?
Ø When a company posts an advt for the same / identical position at different points of time, are there any differences in the values (fields)?
Ø From an analysis of all the advts posted by a given Company (over the past 7 years), can any conclusion be reached as to the changing nature of that company’s business (by correlating the “Skills related Keywords”)?
Ø Can the algorithm predict what job a company will advertise next, and when?
Ø Is there any correlation between “Designation / Position” and the “Keywords”?
Ø From analyzing this huge data, can software auto-generate a complete / editable job advt as soon as a Recruiter simply types the “Designation / Position”?
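Two of these questions are simple enough to sketch right away: the percentage of advts that omit Compensation, and the duplication between Job Description and Desired Profile. Everything here is an assumption for illustration: adverts are plain dicts with a `compensation` key, and "duplication" is measured as word-set (Jaccard) overlap, which is only one of several reasonable measures.

```python
import re


def tokens(text):
    """Lower-cased set of word tokens (letters, digits, '+' and '#')."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower()))


def jaccard_overlap(job_description, desired_profile):
    """Shared words divided by all distinct words, between 0.0 and 1.0."""
    a, b = tokens(job_description), tokens(desired_profile)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def share_without_compensation(adverts):
    """Percentage of adverts whose 'compensation' field is missing or blank."""
    if not adverts:
        return 0.0
    missing = sum(
        1 for ad in adverts if not (ad.get("compensation") or "").strip()
    )
    return 100.0 * missing / len(adverts)
```

An overlap close to 1.0 across most advts would suggest that the two fields are redundant and could be merged on the posting form.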
I believe that, so far, no one has undertaken such a Data mining project.
If carried out diligently, I am sure the outcome would be of immense benefit to:
Ø HR Managers
Ø Recruiting Managers
Ø Educationists
Ø Students
Ø Planning Commission (NITI Aayog)
Ø HRD Ministry
Ø National Skills Development Commission
It would also place the Ministry of Statistics and Programme Implementation on the Centre-Stage of the National Education Planning Scenario.
What can / will such a project yield?
Without exaggerating, it would be safe to assume that this vast database of job advts would contain:
Ø 50 million phrases / sentences
Ø 500 million words
Taken together, these words / phrases / sentences are nothing but a “Database of Intentions” of the Employer Companies (to borrow from John Battelle’s well-researched book about Google).
Our goal shall be to make this (Data mining Algorithm) a dynamic / continuous “Process”, so that we can measure the changing nature of these “Intentions” over a long, long period.
And we must enable a “Researching Visitor” (of the web site) to benefit from these trends / patterns.
Even though 5 million job advts may contain 500 million “words”, these are not unique.
Most of them are used again and again, hundreds or thousands of times.
Through data mining, it is not difficult to compute their “Frequency of Usage”.
These frequencies can then be plotted graphically against any particular time-period.
Such Graphical Representations can be further broken up by:
Ø City Names
Ø Company Names
Ø Industry Names
Ø Function Names
Ø Designations (Vacancy Names), etc.
And such graphical analysis can be done not only for “Keywords” but even for “Key Phrases” and “Sentences”!
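The “Frequency of Usage” computation described above can be sketched in a few lines. I am assuming each advert is a dict with a `text` field and a `posting_date` stored as an ISO string such as "2015-10-27"; both names are hypothetical. The optional `group_field` gives the break-up by City, Company, Industry and so on.

```python
import re
from collections import Counter, defaultdict


def keyword_frequency(adverts, group_field=None):
    """Word counts per (month, group); group is None when no field is given.

    Assumes ad["posting_date"] is an ISO string, so its first 7 characters
    are the 'YYYY-MM' month.
    """
    freq = defaultdict(Counter)
    for ad in adverts:
        month = ad["posting_date"][:7]
        group = ad.get(group_field) if group_field else None
        words = re.findall(r"[a-z]+", ad["text"].lower())
        freq[(month, group)].update(words)
    return dict(freq)
```

Each `(month, group)` entry is one point on the graph for a given word; the same loop works for Key Phrases if the word pattern is replaced by a phrase extractor.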
Take a look at this project paper (NOT ENCLOSED).
It is all about data mining of some 150 million records (location points) and about uncovering “trends / patterns” of physical movements of 300 human volunteers over a “period of time”.
I quote from an article in the Times of India (19 July 2013):
“..the first system of its kind to predict long term human mobility in a unified way , parse the data. "Far Out" does not need to be told exactly what to look for --- it automatically discovered regularities in the data”
“Do you know precisely where you’ll be 285 days from now at 2 pm? Researchers have developed a new tracking software that can tell you exactly where you will be on a precise time and date, years into the future”
What we want to do with the database of 5 million job advts is quite similar, viz., to predict:
WHO (which Company / Industry) will advertise
WHAT (vacancies / positions / designations), and
WHEN (time)
I am talking about developing an “Expert System” through the discovery of specific “Correlations” amongst the various Data Fields of 5 million job advts.
E.g.: What is the correlation between any given “Designation / Vacancy-Name / Advertised Position” and Educational Qualifications?
Here are some examples:
Ø Any designation such as “Production Manager” would call for an “Engineering Degree / Diploma” (but never a CS / CA)
Ø Any designation in the “Finance Function” will require:
· B Com
· M Com
· CA etc.
but never a BE (M) / BE (Chem)
Ø Any designation at Manager level will call for a minimum experience of 5 years (but never a Fresh Graduate with NIL experience)
Ø MBA / BBA / MMS etc. are the most preferred Edu Qualifications for positions in Marketing
Ø No vacancy in an Automobile Manufacturing Company will call for a degree in Pharmacy
Ø No Electrical Machinery Manufacturing company will ever demand a Medical Degree (MBBS)
To a human mind, these (rules) are so obvious!
But no human mind can write down ALL such RULES in 2 minutes, something that your Data mining Software can, and will, do in 5 seconds!
All that you need, after computing the “Frequencies of Occurrences”, is to:
Ø Plot the Coefficients of Correlation between the various Fields (of job advts)
Ø Compute Probabilities for each, and create hundreds of Probability Tables
And since a thousand new job advts are getting added to our Job Advt Database daily, the SAMPLE SIZE is perpetually increasing, thereby increasing the Accuracy of your Predictions!
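One such Probability Table can be sketched as a table of conditional frequencies, for example the probability of each Educational Qualification given a Designation. The function and field names below are hypothetical, and adverts are again assumed to be plain dicts.

```python
from collections import Counter, defaultdict


def probability_table(adverts, given_field, target_field):
    """Estimate P(target value | given value) from observed frequencies."""
    counts = defaultdict(Counter)
    for ad in adverts:
        given, target = ad.get(given_field), ad.get(target_field)
        if given is None or target is None:
            continue  # skip adverts that leave either field blank
        counts[given][target] += 1
    table = {}
    for given, target_counts in counts.items():
        total = sum(target_counts.values())
        table[given] = {t: n / total for t, n in target_counts.items()}
    return table
```

A rule such as "a Production Manager never needs a CA" then shows up simply as a value that is absent from (or has a near-zero probability in) the row for that Designation, and the estimates firm up as the daily additions grow the sample.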
Having done this, imagine the following scenario:
A Recruitment Officer of Wipro comes to our “Post Job” page and, in the field for “Designation”, simply types “Business Analyst”.
And Presto!
The entire Job Advt Form gets auto-filled with the MOST PROBABLE values!
Would that not amaze her?
All that our software has done is analyze the job advts of all “Software Companies” (an Industry), and of WIPRO, for the position of Business Analyst, and fill in the most probable values.
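The auto-fill step can be sketched as "pick the most frequent past value for each field". The version below prefers the company's own earlier advts for that Designation and falls back to all advertisers; that preference order is my assumption, as are the dict keys and the default field list.

```python
from collections import Counter

DEFAULT_FIELDS = ("education", "experience_years", "location", "compensation")


def autofill(adverts, designation, company=None, fields=DEFAULT_FIELDS):
    """Suggest the most frequent past value of each field for a designation."""
    wanted = designation.lower()
    matches = [ad for ad in adverts if ad.get("designation", "").lower() == wanted]
    own = [ad for ad in matches if company and ad.get("company") == company]
    suggestion = {}
    for name in fields:
        # Try the company's own adverts first, then every matching advert.
        for pool in (own, matches):
            values = [ad[name] for ad in pool if ad.get(name) not in (None, "")]
            if values:
                suggestion[name] = Counter(values).most_common(1)[0][0]
                break
    return suggestion
```

Fields with no history at all are simply left out of the suggestion, so the Recruiter sees a blank box rather than a guess.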
This is no rocket science!
We had actually attempted it partially, albeit in a crude way, in our earlier web site, www.IndiaRecruiter.net
What surprises me is that no one has attempted this so far!
Especially Naukri / TimesJobs / MonsterIndia, who have accumulated millions of job advts!
Anyway, the fact that they have so far ignored this Line of Examination will work to the advantage of the Ministry of Statistics and Programme Implementation, making YOU the very first person in the entire world to come up with a PREDICTION MODEL in the area of JOBS.
Where is the greatest decline of jobs being advertised?
How much is the percentage decline?
Ø In which Industry?
Ø In which Company?
Ø In which City?
Ø In which Region?
Ø In which Skills?
Ø For which Positions?
Ø For which Education Levels? ... etc.
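The percentage decline for any of these break-ups can be sketched as a comparison of advert counts between two periods. I am assuming dict records with an ISO `posting_date` string, so a period is just a prefix such as "2015-09" (a month) or "2015" (a year); the function name is hypothetical.

```python
from collections import Counter


def percentage_decline(adverts, group_field, earlier, later):
    """Percentage drop in advert counts per group, largest decline first.

    `earlier` and `later` are posting-date prefixes, e.g. '2015-09'.
    A negative figure means the group grew. Groups with no adverts in the
    earlier period are skipped, since no percentage can be computed.
    """
    before = Counter(
        ad[group_field] for ad in adverts if ad["posting_date"].startswith(earlier)
    )
    after = Counter(
        ad[group_field] for ad in adverts if ad["posting_date"].startswith(later)
    )
    declines = {
        group: 100.0 * (n_before - after.get(group, 0)) / n_before
        for group, n_before in before.items()
    }
    return sorted(declines.items(), key=lambda item: item[1], reverse=True)
```

Calling it with `group_field` set to the industry, company, city or education field answers each of the questions in the list with the same few lines.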
One could even correlate these graphs with other publicly available statistical data, such as:
Ø IIP (Index of Industrial Production)
Ø Stock Market Index
Ø Currency Exchange Rate (e.g., declining Rupee)
Ø Decline in GDP / Increasing Fiscal Deficit
Ø CAD (Current Account Deficit)
Ø Foreign Investments
Ø Primary Bank Rates of RBI, etc.
With proper correlations, one could even predict how much the job market will shrink further, or grow, over the next 6 months!
Such a “Predictive Model of the Job Market” would be of immense interest not only to economists but also to the HRD Ministry / Planning Commission / Educational Institutions and, of course, the students themselves.
-------------------------------------------------------------------------------