Campus Recruitment¶
1. Dataset Summary¶
This dataset covers the academic and employability factors that influence campus placement, which takes place at the end of the MBA program, before graduation. Students are recruited based on their past academic performance, choice of specialization, placement test score, and work experience. The dataset contains several categorical features, such as gender, school board, specialization, and degree type. The numerical features are the student's percentage scores in secondary school, higher secondary school, college, the employability test, and the MBA, plus the salary offered.
The features are:
- gender
- ssc_p = Secondary Education percentage- 10th Grade
- ssc_b = Board of Education- Central/ Others
- hsc_p = Higher Secondary Education percentage- 12th Grade
- hsc_b = Board of Education- Central/ Others
- hsc_s = Specialization in Higher Secondary Education
- degree_p = Degree Percentage
- degree_t = Under Graduation(Degree type)- Field of degree education
- workex = Work Experience
- etest_p = Employability test percentage ( conducted by college)
- specialisation = Post Graduation(MBA)- Specialization
- mba_p = MBA percentage
- status = Status of placement- Placed/Not placed
- salary = Salary offered by corporate to candidates
| | sl_no | gender | ssc_p | ssc_b | hsc_p | hsc_b | hsc_s | degree_p | degree_t | workex | etest_p | specialisation | mba_p | status | salary |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | M | 67.00 | Others | 91.00 | Others | Commerce | 58.00 | Sci&Tech | No | 55.0 | Mkt&HR | 58.80 | Placed | 270000.0 |
1 | 2 | M | 79.33 | Central | 78.33 | Others | Science | 77.48 | Sci&Tech | Yes | 86.5 | Mkt&Fin | 66.28 | Placed | 200000.0 |
2 | 3 | M | 65.00 | Central | 68.00 | Central | Arts | 64.00 | Comm&Mgmt | No | 75.0 | Mkt&Fin | 57.80 | Placed | 250000.0 |
3 | 4 | M | 56.00 | Central | 52.00 | Central | Science | 52.00 | Sci&Tech | No | 66.0 | Mkt&HR | 59.43 | Not Placed | NaN |
4 | 5 | M | 85.80 | Central | 73.60 | Central | Commerce | 73.30 | Comm&Mgmt | No | 96.8 | Mkt&Fin | 55.50 | Placed | 425000.0 |
The dataset is already clean and has no missing values except in the salary column, because a salary is recorded only for candidates who were placed.
| | features | data_type | nan_total | nan_pct | unique | values_ex |
---|---|---|---|---|---|---|
0 | sl_no | int64 | 0 | 0.00 | 215 | [48, 149] |
1 | gender | object | 0 | 0.00 | 2 | [M, F] |
2 | ssc_p | float64 | 0 | 0.00 | 103 | [43.0, 69.6] |
3 | ssc_b | object | 0 | 0.00 | 2 | [Central, Others] |
4 | hsc_p | float64 | 0 | 0.00 | 97 | [74.0, 71.98] |
5 | hsc_b | object | 0 | 0.00 | 2 | [Central, Others] |
6 | hsc_s | object | 0 | 0.00 | 3 | [Arts, Science, Commerce] |
7 | degree_p | float64 | 0 | 0.00 | 89 | [85.0, 56.2] |
8 | degree_t | object | 0 | 0.00 | 3 | [Comm&Mgmt, Others, Sci&Tech] |
9 | workex | object | 0 | 0.00 | 2 | [No, Yes] |
10 | etest_p | float64 | 0 | 0.00 | 100 | [71.2, 88.0] |
11 | specialisation | object | 0 | 0.00 | 2 | [Mkt&HR, Mkt&Fin] |
12 | mba_p | float64 | 0 | 0.00 | 205 | [56.11, 64.27] |
13 | status | object | 0 | 0.00 | 2 | [Placed, Not Placed] |
14 | salary | float64 | 67 | 31.16 | 45 | [216000.0, 255000.0] |
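A summary table like the one above can be built with a few pandas calls. The helper that produced it is not shown in the notebook, so the sketch below is only one possible version: the function name `summarize`, its `n_examples` parameter, and the five-row `sample` frame (copied from the preview rows above) are assumptions for illustration. It takes the first distinct values as examples, whereas the original table may have sampled them differently.

```python
import pandas as pd

def summarize(df, n_examples=2):
    """Per-column overview: dtype, missing values, distinct values, examples."""
    rows = []
    for col in df.columns:
        missing = df[col].isna()
        rows.append({
            "features": col,
            "data_type": str(df[col].dtype),
            "nan_total": int(missing.sum()),
            "nan_pct": round(float(missing.mean()) * 100, 2),
            "unique": int(df[col].nunique()),  # NaN is not counted as a value
            "values_ex": df[col].dropna().unique()[:n_examples].tolist(),
        })
    return pd.DataFrame(rows)

# Hypothetical mini-frame built from the five preview rows (subset of columns).
sample = pd.DataFrame({
    "gender": ["M", "M", "M", "M", "M"],
    "ssc_p": [67.00, 79.33, 65.00, 56.00, 85.80],
    "status": ["Placed", "Placed", "Placed", "Not Placed", "Placed"],
    "salary": [270000.0, 200000.0, 250000.0, None, 425000.0],
})
summary = summarize(sample)
print(summary)
```

On the full dataset the same call would be `summarize(df_train)`, assuming `df_train` is the loaded DataFrame used elsewhere in this notebook.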
2. Candidates Profiling¶
2.1 How many of the candidates passed the job placement?¶
Most of the candidates were placed: almost 70% of them were offered a job.
2.2 How did job placement differ by gender?¶
There are more male than female candidates. Male candidates made up more than 60% of the total.
A larger share of female candidates were not placed compared with their male counterparts: 36.8% of the female candidates were turned down, against 28.1% of the male candidates.
gender | % Not Placed | % Placed |
---|---|---|
Female | 36.842105 | 63.157895 |
Male | 28.057554 | 71.942446 |
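Row-percentage tables like this one can be produced with `pd.crosstab` and `normalize="index"`. The sketch below uses a small made-up frame (eight invented rows, not the real data) purely to show the mechanics; the variable names `toy` and `pct` are illustrative.

```python
import pandas as pd

# Made-up rows for illustration only; these are not the dataset's values.
toy = pd.DataFrame({
    "gender": ["Male", "Male", "Male", "Male",
               "Female", "Female", "Female", "Female"],
    "status": ["Placed", "Placed", "Placed", "Not Placed",
               "Placed", "Not Placed", "Not Placed", "Placed"],
})

# normalize="index" divides each row by its row total, so every row sums to 1;
# multiplying by 100 turns the shares into percentages.
pct = pd.crosstab(toy["gender"], toy["status"], normalize="index") * 100
pct = pct.add_prefix("% ")
print(pct)
```

Swapping `toy` for the real DataFrame and `"gender"` for another categorical column (for example `ssc_b` or `workex`) would give the corresponding tables in later sections.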
3. Candidate Education History during Secondary and Higher Secondary School¶
3.1 How did the candidates' performance in secondary and higher secondary school affect their placement?¶
Candidates who were placed had performed better in secondary education (10th grade). Their average score was 71.7%, far higher than the 57.5% average of those who were not placed.
status | ssc_p |
---|---|
Not Placed | 57.544030 |
Placed | 71.721486 |
Placed candidates also performed better in higher secondary education (12th grade), with an average score of 69.9%, more than 10 points higher than those who were not placed.
status | hsc_p |
---|---|
Not Placed | 58.395522 |
Placed | 69.926554 |
It could be concluded that high performance during secondary and higher secondary education is an indicator of whether a candidate will be placed.
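The group averages quoted above come down to a `groupby` on `status`. As a minimal sketch, the snippet below runs it on only the five preview rows from the dataset summary (the frame name `preview` is an assumption), so its numbers will not match the full-data averages of 71.7% and 57.5%.

```python
import pandas as pd

# The five preview rows shown in the dataset summary (subset of columns).
preview = pd.DataFrame({
    "status": ["Placed", "Placed", "Placed", "Not Placed", "Placed"],
    "ssc_p": [67.00, 79.33, 65.00, 56.00, 85.80],
    "hsc_p": [91.00, 78.33, 68.00, 52.00, 73.60],
})

# Mean secondary and higher secondary percentage per placement status.
means = preview.groupby("status")[["ssc_p", "hsc_p"]].mean()
print(means)
```

The same pattern with `degree_p`, `mba_p`, or `etest_p` would give the comparisons discussed in the later sections.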
3.2 Did the secondary and higher secondary school board affect candidates' placement?¶
There are several school boards in India, each with a different educational focus and therefore possibly a different curriculum. For example, the most popular school board, the Central Board of Secondary Education (CBSE), focuses on preparing students for Science and Math related subjects. Other boards put different emphasis on science, arts and language subjects. This dataset groups the school boards into two: (1) Central, or CBSE, and (2) Others, which covers school boards other than CBSE.
From the chart and tables below, candidates were split fairly evenly between the two groups: the percentages imply 116 Central vs. 99 Others candidates at the secondary level and 84 Central vs. 131 Others at the higher secondary level, so CBSE was the majority only for secondary school. However, the type of school board does not seem to matter for placement, since the share of candidates who got the job was similar in both categories: for the secondary board, 67.2% for candidates with Central board education and 70.7% for candidates from other boards. A hypothesis test can check whether the difference in placement proportions between the Central and Others groups is statistically significant.
ssc_b | % Not Placed | % Placed |
---|---|---|
Central | 32.758621 | 67.241379 |
Others | 29.292929 | 70.707071 |
hsc_b | % Not Placed | % Placed |
---|---|---|
Central | 32.142857 | 67.857143 |
Others | 30.534351 | 69.465649 |
Hypothesis testing
H0: $p_{\text{Central}} = p_{\text{Others}}$
(the proportion of placed candidates from the Central board is the same as the proportion of placed candidates from Other boards)
H1: $p_{\text{Central}} \neq p_{\text{Others}}$
(the proportion of placed candidates from the Central board differs from the proportion of placed candidates from Other boards)
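For reference, the pooled two-proportion z-test behind these hypotheses can be written out as follows. The symbols mirror the variable names in the function below: $y_i$ is the number of placed candidates in group $i$, $n_i$ is the group size, and $\Phi$ (an added symbol) denotes the standard normal CDF.

$$
\hat{p}_1 = \frac{y_1}{n_1}, \qquad
\hat{p}_2 = \frac{y_2}{n_2}, \qquad
\hat{p} = \frac{y_1 + y_2}{n_1 + n_2}
$$

$$
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad
\text{p-value} = 2\,\Phi\left(-\lvert z \rvert\right)
$$

H0 is rejected at the 5% level when the p-value falls below 0.05.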
import numpy as np
from scipy.stats import norm

def two_proportions(column, sample1, sample2, value='Placed'):
    """Two-sided pooled z-test comparing the share of `value` in two groups of df_train."""
    # Sample sizes
    n1 = len(df_train[df_train[column] == sample1])
    n2 = len(df_train[df_train[column] == sample2])
    # Number of candidates with status == value in each group
    y1 = len(df_train[(df_train[column] == sample1) & (df_train.status == value)])
    y2 = len(df_train[(df_train[column] == sample2) & (df_train.status == value)])
    # Estimates of the population proportions
    # (rounded to 2 decimals, which slightly shifts the test statistic
    # compared with using the unrounded values)
    p1 = round(y1 / n1, 2)
    p2 = round(y2 / n2, 2)
    # Estimate of the combined (pooled) population proportion
    phat = (y1 + y2) / (n1 + n2)
    # Estimate of the variance of the combined population proportion
    va = phat * (1 - phat)
    # Standard error of the difference in proportions under H0
    se = np.sqrt(va * (1 / n1 + 1 / n2))
    # Test statistic and its two-tailed p-value
    test_stat = (p1 - p2) / se
    pvalue = 2 * norm.cdf(-np.abs(test_stat))
    # Print the test statistic and its p-value
    print('alpha : 0.05')
    print("\nTest Statistic")
    print(round(test_stat, 2))
    print("\nP-Value")
    print(round(pvalue, 2))
Hypothesis testing on ssc_b (Secondary education school board):
two_proportions(column='ssc_b', sample1='Central', sample2='Others', value='Placed')
alpha : 0.05
Test Statistic
-0.63
P-Value
0.53
There is not enough evidence to reject H0 at alpha = 0.05, so we cannot conclude that the proportion of placed candidates differs between the Central and Others secondary school boards.
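As a quick arithmetic check of the ssc_b result, the same pooled z-test can be run from counts alone using only the standard library. The counts used here (78 of 116 placed for Central, 70 of 99 for Others) are inferred from the percentages in the ssc_b table rather than read from the data, and the helper name `two_proportion_ztest` and its `round_props` flag are illustrative, not part of the original notebook.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(y1, n1, y2, n2, round_props=False):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = y1 / n1, y2 / n2
    if round_props:
        # Mimic the notebook function, which rounds both proportions first.
        p1, p2 = round(p1, 2), round(p2, 2)
    phat = (y1 + y2) / (n1 + n2)                       # pooled proportion
    se = sqrt(phat * (1 - phat) * (1 / n1 + 1 / n2))   # standard error under H0
    z = (p1 - p2) / se
    pvalue = 2 * NormalDist().cdf(-abs(z))             # two-tailed
    return z, pvalue

# Counts inferred from the ssc_b percentage table (an assumption).
z_rounded, p_rounded = two_proportion_ztest(78, 116, 70, 99, round_props=True)
z_exact, p_exact = two_proportion_ztest(78, 116, 70, 99)

print(round(z_rounded, 2), round(p_rounded, 2))  # -0.63 0.53, as recorded above
print(z_exact, p_exact)
```

With `round_props=True` the sketch reproduces the recorded output. With unrounded proportions the statistic is somewhat smaller in magnitude and the p-value larger, so the decision not to reject H0 is unchanged.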
Hypothesis testing on hsc_b (Higher secondary education school board)
two_proportions(column='hsc_b', sample1='Central', sample2='Others', value='Placed')
alpha : 0.05
Test Statistic
-0.15
P-Value
0.88
There is not enough evidence to reject H0 at alpha = 0.05, so we cannot conclude that the proportion of placed candidates differs between the Central and Others higher secondary school boards.
3.3 Did the placement favor a particular specialization in higher secondary school?¶
Most of the candidates specialized in Commerce in higher secondary school, followed by Science and Arts. Commerce and Science had nearly the same share of placed candidates (69.9% and 69.2%), while Arts was lower at 54.5%.
hsc_s | % Not Placed | % Placed |
---|---|---|
Arts | 45.454545 | 54.545455 |
Commerce | 30.088496 | 69.911504 |
Science | 30.769231 | 69.230769 |
4. Candidate Education History during Undergraduate School¶
4.1 How did the candidates' performance in undergraduate affect their placement?¶
Candidates who were placed also performed better during their undergraduate studies. Their average score was 68.7%, about 7.6 points higher than the 61.1% average of those who were not placed.
status | degree_p |
---|---|
Not Placed | 61.134179 |
Placed | 68.740541 |
4.2 What was the most popular degree among the candidates and the candidates who passed the placement?¶
This dataset lists only three types of undergraduate degree: Science & Technology, Commerce & Management, and Others. As expected, most of the candidates had an undergraduate degree in Commerce & Management, followed by Science & Technology.
Candidates with an undergraduate degree in Comm&Mgmt had a similar chance of being placed (70.3%) to those with a degree in Sci&Tech (69.5%), while candidates in the Others category were placed less often (45.5%).
degree_t | % Not Placed | % Placed |
---|---|---|
Comm&Mgmt | 29.655172 | 70.344828 |
Others | 54.545455 | 45.454545 |
Sci&Tech | 30.508475 | 69.491525 |
5. Candidate Education and Work History during Post-Graduation and MBA School¶
5.1 How many of the candidates had work experience?¶
Most of the candidates did not have any work experience, but those who did were placed far more often: about 86.5% of the candidates with work experience were accepted, compared with 59.6% of those without.
workex | % Not Placed | % Placed |
---|---|---|
Experienced | 13.513514 | 86.486486 |
Not Experienced | 40.425532 | 59.574468 |
5.2 Did the candidates' performance in MBA affect the placement?¶
Surprisingly, there was no difference in performance during MBA school between those who passed and those who did not.
5.3 What were the specializations taken by candidates during MBA?¶
There are two specialization types in this dataset: (1) Marketing & Finance, and (2) Marketing & Human Resources. Most of the candidates came from the Mkt&Fin specialization. A higher share of Mkt&Fin candidates were placed (79.2%), while only 55.8% of candidates from Mkt&HR were. This might suggest that the job openings were for roles related to the Marketing & Finance field.
specialisation | % Not Placed | % Placed |
---|---|---|
Mkt&Fin | 20.833333 | 79.166667 |
Mkt&HR | 44.210526 | 55.789474 |
6. Candidate Performance in Placement Test¶
The chart below shows little difference in placement test scores between candidates who were placed and those who were not. This suggests that the placement test played only a small role in deciding whether a candidate qualified for the job.
| | features | data_type | nan_total | nan_pct | unique | values_ex |
---|---|---|---|---|---|---|
0 | sl_no | int64 | 0 | 0.00 | 215 | [55, 174] |
1 | gender | object | 0 | 0.00 | 2 | [Female, Male] |
2 | ssc_p | float64 | 0 | 0.00 | 103 | [48.0, 61.08] |
3 | ssc_b | object | 0 | 0.00 | 2 | [Central, Others] |
4 | hsc_p | float64 | 0 | 0.00 | 97 | [67.2, 72.8] |
5 | hsc_b | object | 0 | 0.00 | 2 | [Others, Central] |
6 | hsc_s | object | 0 | 0.00 | 3 | [Commerce, Arts, Science] |
7 | degree_p | float64 | 0 | 0.00 | 89 | [63.35, 54.38] |
8 | degree_t | object | 0 | 0.00 | 3 | [Sci&Tech, Others, Comm&Mgmt] |
9 | workex | object | 0 | 0.00 | 2 | [Not Experienced, Experienced] |
10 | etest_p | float64 | 0 | 0.00 | 100 | [74.28, 93.4] |
11 | specialisation | object | 0 | 0.00 | 2 | [Mkt&HR, Mkt&Fin] |
12 | mba_p | float64 | 0 | 0.00 | 205 | [64.34, 66.06] |
13 | status | object | 0 | 0.00 | 2 | [Not Placed, Placed] |
14 | salary | float64 | 67 | 31.16 | 45 | [260000.0, 393000.0] |
7. Conclusions and Suggestions¶
Among the categorical features, the one most clearly associated with placement is workex, that is, whether or not a candidate has work experience. However, this is based only on comparing pass rates across the categorical features; the numerical features have not yet been compared in the same detail.
The next step for the analysis is to combine the categorical and numerical features and group them into three: (1) Candidate secondary & higher secondary education history, (2) Candidate undergraduate history, and (3) Candidate post-graduate and MBA history.