# PipeDataset

## Class: PipeDataset

```python
PipeDataset(
    input_dir: Optional[str] = None,
    augmentation: Optional[torchvision.transforms.transforms.Compose] = None,
    samples: Optional[int] = None,
    batch_size: int = 1,
    shuffle: bool = False,
    drop_last: bool = False,
    num_workers: int = 0,
    config: Optional[dict] = None
)
```

### Description

`PipeDataset` is a convenience class for managing datasets in PyTorch. It uses the `LightlyDataset` class to load and handle the data, and adds dataset splitting, updating, random subsetting, and visualization on top of it.

### Parameters

* `input_dir` (str, optional): The directory where the dataset is located. Default is None.
* `augmentation` (torchvision.transforms.transforms.Compose, optional): The transformations to apply to the dataset. Default is None.
* `samples` (int, optional): The number of samples to load from the dataset. If None, all samples will be loaded. Default is None.
* `batch_size` (int, default=1): The batch size for the DataLoader.
* `shuffle` (bool, default=False): Whether to shuffle the dataset before loading.
* `drop_last` (bool, default=False): Whether to drop the last incomplete batch if the dataset size is not divisible by the batch size.
* `num_workers` (int, default=0): The number of worker processes for data loading.
* `config` (dict, optional): Configuration dictionary containing the above parameters. Default is None.
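
How `batch_size` and `drop_last` interact can be checked with a little arithmetic. The sketch below assumes `PipeDataset` forwards these options to a standard PyTorch `DataLoader`, which yields `ceil(n / batch_size)` batches per epoch, or `floor(n / batch_size)` when `drop_last=True`. `num_batches` is a hypothetical helper written for illustration, not part of the library.

```python
import math


def num_batches(num_samples: int, batch_size: int = 1, drop_last: bool = False) -> int:
    """Batches per epoch (hypothetical helper mirroring DataLoader sizing)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if drop_last:
        # The final, incomplete batch is discarded.
        return num_samples // batch_size
    # The final batch is kept even if it holds fewer than batch_size samples.
    return math.ceil(num_samples / batch_size)


# 10 samples with batch_size=3: the last batch holds a single sample
# unless drop_last=True removes it.
batches_kept = num_batches(10, batch_size=3)
batches_dropped = num_batches(10, batch_size=3, drop_last=True)
```

If every batch must have the same size (for example, with batch-dependent losses), `drop_last=True` is the usual choice, at the cost of skipping a few samples each epoch.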

### Functions

* **`split(self, ratio: float) -> Tuple[PipeDataset, PipeDataset]`**

  Splits the dataset into two based on the specified ratio.

  **Parameters:**

  * `ratio` (float): The fraction of the dataset assigned to the first split. Should be between 0 and 1.

  **Returns:**

  * A tuple containing two `PipeDataset` instances, each representing a split of the dataset.

  **Example:**

  ```python
  train_dataset, val_dataset = pipe_dataset.split(0.8)
  ```
* **`update(self, dataset: LightlyDataset) -> None`**

  Updates the current dataset with a new one.

  **Parameters:**

  * `dataset` (LightlyDataset): The new dataset to replace the current dataset.

  **Returns:**

  * None.

  **Example:**

  ```python
  new_dataset = LightlyDataset(input_dir='new_directory')
  pipe_dataset.update(new_dataset)
  ```
* **`__len__(self) -> int`**

  Returns the number of samples in the dataset.

  **Returns:**

  * The number of samples in the dataset.

  **Example:**

  ```python
  num_samples = len(pipe_dataset)
  ```
* **`__getitem__(self, index: int) -> Tuple`**

  Returns a sample from the dataset at the specified index.

  **Parameters:**

  * `index` (int): The index of the desired sample.

  **Returns:**

  * A tuple containing the image, label, and title of the sample at the specified index.

  **Example:**

  ```python
  image, label, title = pipe_dataset[10]
  ```
* **`plot(self, indices: Union[List[int], Tuple[int, ...]]) -> None`**

  Plots the images from the dataset at the specified indices.

  **Parameters:**

  * `indices` (list or tuple): Indices of the images to plot.

  **Returns:**

  * None.

  **Example:**

  ```python
  pipe_dataset.plot([10, 20, 30])
  ```
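
To make the `ratio` argument of `split` concrete, here is a minimal sketch of a ratio-based index split. It is an assumption about the semantics, not the library's implementation: `split_indices` is a hypothetical helper, and this page does not state whether `PipeDataset.split` rounds, truncates, or shuffles before splitting.

```python
from typing import List, Tuple


def split_indices(num_samples: int, ratio: float) -> Tuple[List[int], List[int]]:
    """Partition range(num_samples) into two index lists (hypothetical helper)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio should be between 0 and 1")
    # Rounding avoids floating-point surprises such as 100 * 0.29 == 28.999...
    cut = round(num_samples * ratio)
    indices = list(range(num_samples))
    return indices[:cut], indices[cut:]


# With ratio=0.8, the first list gets 8 of the 10 indices and the second gets 2.
first, second = split_indices(10, 0.8)
```

Under these assumptions the two parts are disjoint and together cover every sample, which is the property expected of the `train_dataset, val_dataset = pipe_dataset.split(0.8)` example.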

### Returns

An instance of the `PipeDataset` class.

### Use cases

* When you need to load a dataset from a directory for training or testing a model in PyTorch.
* When you want to apply a set of transformations to the dataset.
* When you need to split the dataset into training and validation sets, or update the dataset with a new one.
* When you want to plot the images from the dataset for visualization purposes.


