
Rubix ML - Titanic - Machine Learning from Disaster

Content

An example Rubix ML project that predicts which passengers survived the Titanic shipwreck using a Random Forest classifier and a very famous dataset from a Kaggle competition. In this tutorial, you'll learn about classification and advanced preprocessing techniques. By the end of the tutorial, you'll be able to submit your own predictions to the Kaggle competition.

From Kaggle:

This is the legendary Titanic ML competition – the best, first challenge for you to dive into ML competitions

The competition is simple: use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.

In this competition, you’ll gain access to two similar datasets that include passenger information like name, age, gender, socio-economic class, etc. One dataset is titled train.csv and the other is titled test.csv.

Train.csv will contain the details of a subset of the passengers on board (891 to be exact) and importantly, will reveal whether they survived or not, also known as the “ground truth”.

The test.csv dataset contains similar information but does not disclose the “ground truth” for each passenger. It’s your job to predict these outcomes.

Using the patterns you find in the train.csv data, predict whether the other 418 passengers on board (found in test.csv) survived.

Installation

Clone the project locally using Composer:
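The install command was lost in extraction; a standard Composer invocation for this package name would be (the create-project form is an assumption):

```sh
composer create-project jenutka/titanic_php
```

This downloads the project and resolves its dependencies in one step.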

Requirements

  • PHP >= 7.4
  • rubix/ml ^2.3

Tutorial

Introduction

Kaggle is a platform that allows you to test your data science skills by engaging in contests. This is the legendary Titanic ML competition: the best first challenge for diving into ML competitions and familiarizing yourself with how the Kaggle platform works. The competition is simple: use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.

The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.

While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others. In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender, socio-economic class, etc.).

We'll choose Random Forest as our learner since it offers good performance and is capable of handling both categorical and continuous features.

Note: The source code for this example can be found in the train.php and predict.php files in the project root.

Script description

The project is separated into two parts: a training script (train.php) and a prediction script (predict.php).

The training data are given to us in train.csv, which contains both the features and the labels for training the model. We train the model on the whole dataset, because our testing data in test.csv are unlabeled; in this case we can only validate our predictions through the Kaggle competition.

Extracting the Data

Each feature is defined by a column in train.csv. For our purposes we choose only the features with the most informative value for our model; these are both continuous and categorical. To extract them from train.csv into a dataset object we use the Column Picker. The last feature we extract is our target (label) feature, Survived.
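The extraction step can be sketched as follows (the exact column list is an assumption; adjust it to the features you actually pick):

```php
<?php

use Rubix\ML\Extractors\CSV;
use Rubix\ML\Extractors\ColumnPicker;

// Pick the informative columns from train.csv; the label column Survived
// comes last so it can be used as the target.
$extractor = new ColumnPicker(new CSV('train.csv', true), [
    'Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Embarked', 'Survived',
]);
```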

Preprocessing the training Data

Because the *.csv files contain missing values, we need to preprocess them for use with the MissingDataImputer. For this purpose we use a LambdaFunction transformer to which we pass the mapping function $toPlaceholder.
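A minimal sketch of such a mapping function; the '?' placeholder and the by-reference callback signature are assumptions based on common Rubix ML usage:

```php
<?php

use Rubix\ML\Transformers\LambdaFunction;

// Replace empty CSV fields with the '?' placeholder so the
// Missing Data Imputer can recognize and impute them later.
$toPlaceholder = function (array &$samples) {
    foreach ($samples as &$sample) {
        foreach ($sample as &$value) {
            if ($value === '' or $value === null) {
                $value = '?';
            }
        }
    }
};

$dataset->apply(new LambdaFunction($toPlaceholder));
```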

The target values in train.csv are 0 and 1. Our model would treat these as floating point numbers, so we map them to the categorical labels Dead and Survived instead.

For the numerical variables we transform the data with the MinMaxNormalizer; for the categorical variables we use the OneHotEncoder. We instantiate new objects for these two transformers and for the MissingDataImputer.

Finally we create the Labeled dataset and fit it with our preprocessing transformers.

Saving Transformers

Because we want to apply the same fitted preprocessing to the testing dataset test.csv, and the prediction part is realized by a separate script, predict.php, we need to save our fitted transformers as serialized objects. For this purpose we create new Filesystem persister objects using the RBX file format.
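A sketch of the serialization step, assuming transformer instances named $imputer, $normalizer, and $encoder; the file names are illustrative:

```php
<?php

use Rubix\ML\Persisters\Filesystem;
use Rubix\ML\Serializers\RBX;

$serializer = new RBX();

// Persist each fitted transformer to its own .rbx file for reuse in predict.php.
(new Filesystem('imputer.rbx'))->save($serializer->serialize($imputer));
(new Filesystem('normalizer.rbx'))->save($serializer->serialize($normalizer));
(new Filesystem('encoder.rbx'))->save($serializer->serialize($encoder));
```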

Model training

After we have prepared our data, we can train the predictive model. As estimator we use the RandomForest, an ensemble of ClassificationTrees, which is well suited to our relatively small dataset.
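The training step can be sketched as follows; the hyperparameter values shown are illustrative, not the project's actual settings:

```php
<?php

use Rubix\ML\Classifiers\ClassificationTree;
use Rubix\ML\Classifiers\RandomForest;

// An ensemble of 100 classification trees with a maximum depth of 10.
$estimator = new RandomForest(new ClassificationTree(10), 100);

$estimator->train($dataset);
```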

Saving the estimator

Finally we save the trained model for use by the predict.php script. As with the transformers, we again use a Filesystem object with the RBX file format, but instead of serializing directly we save the predictive model through a PersistentModel object. To guard against overwriting an existing model, we ask the user to confirm before saving the newly trained model.
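A sketch of the save step with the confirmation prompt; the model file name is illustrative:

```php
<?php

use Rubix\ML\PersistentModel;
use Rubix\ML\Persisters\Filesystem;

$model = new PersistentModel($estimator, new Filesystem('model.rbx'));

// Ask for confirmation so an existing model is not silently overwritten.
echo 'Save the trained model? (y|n): ';

if (strtolower(trim(fgets(STDIN))) === 'y') {
    $model->save();
}
```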

Now we have finished our training part train.php, which we execute by calling it from the command line.

Now we can move on to creating the prediction part, predict.php.

Extracting the test Data

For the prediction part we need to extract our test data, which do not contain labels. During extraction we name the same features as for the training set, but omit the target Survived.

Loading transformers

To transform our test dataset we need to use the same transformations fitted on the training dataset, so we load and deserialize our previously saved transformers.
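The loading step mirrors the saving step in train.php; file names are illustrative:

```php
<?php

use Rubix\ML\Persisters\Filesystem;
use Rubix\ML\Serializers\RBX;

$serializer = new RBX();

// Restore the transformers fitted during training.
$imputer = $serializer->deserialize((new Filesystem('imputer.rbx'))->load());
$normalizer = $serializer->deserialize((new Filesystem('normalizer.rbx'))->load());
$encoder = $serializer->deserialize((new Filesystem('encoder.rbx'))->load());
```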

Preprocessing the test Data

For the testing data we create a new Unlabeled dataset object to which we pass our $extractor. Since we have loaded the fitted transformers, we can apply them to this dataset object. As with the training data, we use the $toPlaceholder function to map missing values so the MissingDataImputer can handle them.
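A sketch of the test-set pipeline, assuming $extractor picks the test columns and $toPlaceholder, $imputer, $normalizer, and $encoder are defined or loaded as above:

```php
<?php

use Rubix\ML\Datasets\Unlabeled;
use Rubix\ML\Transformers\LambdaFunction;

// Apply the same preprocessing chain as in training, but on an
// unlabeled dataset built from test.csv.
$dataset = Unlabeled::fromIterator($extractor)
    ->apply(new LambdaFunction($toPlaceholder))
    ->apply($imputer)
    ->apply($normalizer)
    ->apply($encoder);
```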

Loading estimator

Now we can load our persisted RandomForest estimator into our script using the static load() method.
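For example (the model file name is illustrative):

```php
<?php

use Rubix\ML\PersistentModel;
use Rubix\ML\Persisters\Filesystem;

// Restore the RandomForest saved by train.php.
$estimator = PersistentModel::load(new Filesystem('model.rbx'));
```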

Making predictions

To make predictions on our unlabeled testing dataset we call the predict() method on the loaded estimator and store the predicted classes in the $predictions variable.
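In code:

```php
<?php

// Predict a class label ('Dead' or 'Survived') for every test passenger.
$predictions = $estimator->predict($dataset);
```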

Saving predictions

Now we need to bring our stored predictions into the required format so that we can submit them to the Kaggle competition.

First we map our labels back to 1 and 0. For this we create a function bin_mapper, which we pass as a parameter to the built-in PHP function array_map.

Next we extract the PassengerId column from test.csv into an array $ids. We apply the array_unshift function to both columns to prepend the header names. Then we instantiate a CSV extractor for the file predictions.csv and finally export our two columns of data into it with the array_transpose function.
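The export step might look like this; the function and class names follow the text above, but the exact export() and array_transpose() usage is a sketch:

```php
<?php

use Rubix\ML\Extractors\CSV;
use Rubix\ML\Extractors\ColumnPicker;

use function Rubix\ML\array_transpose;

// Map the categorical labels back to Kaggle's 0/1 encoding.
$bin_mapper = function ($label) {
    return $label === 'Survived' ? 1 : 0;
};

$predictions = array_map($bin_mapper, $predictions);

// Collect the PassengerId column from test.csv.
$ids = array_column(
    iterator_to_array(new ColumnPicker(new CSV('test.csv', true), ['PassengerId'])),
    'PassengerId'
);

// Prepend the header names, then write both columns to predictions.csv.
array_unshift($ids, 'PassengerId');
array_unshift($predictions, 'Survived');

(new CSV('predictions.csv'))->export(array_transpose([$ids, $predictions]));
```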

Now we can run our prediction script by calling it from the command line.

After successfully generating the file predictions.csv, we can submit it to the [Kaggle competition](https://www.kaggle.com/competitions/titanic) and look at our result on the public leaderboard.

Conclusion

This tutorial describes the whole process of machine learning prediction with the Rubix ML PHP library. We can take this example as a starting point for further improvements. For example, we can apply advanced feature engineering to extract more information for training the model. Next, we can examine the predictive model itself: we can tune the model's hyperparameters, or compare the performance of other classifiers (for example a support vector machine or a neural network).

As a next activity, we can try to deploy our predictive model to a webpage or server, where a visitor, after filling in a form with our features, gets a prediction of his or her chances had he or she embarked on the Titanic.

This example can also serve as a template for a workflow that can be applied to other machine learning problems.

License

The code is licensed under CC BY-NC 4.0.

